digitalmars.D - Strategies for resolving cyclic dependencies in static ctors

"Nick Sabalausky" <a a.a> writes:

I'm intending this thread as somewhat of a roundtable-like discussion. 
Hopefully we can come up with good material for a short article on Wiki4D, 
or maybe the D website, or wherever.

The scenario: A coder is writing some D, compiles, runs and gets a "Cyclic 
dependency in static ctors" error. Crap! A pain for experienced D users, and 
very bad PR for new D users. (Hopefully we'll eventually get some sort of 
solution for this, but in the meantime D users just have to deal with it.)

The question: What now? What strategies do people find useful for dealing 
with this? Any specific "first steps" to take? Best practices? Etc.

Aside from the old "start merging modules" PITA that many of us are familiar 
with from the old "100 forward reference errors at the drop of a hat" days, 
I've found one viable (but still kinda PITA) strategy so far:

1. Look at the line that says: "module foo -> module bar -> module foo"

2. Pick one of those two modules (whichever has the simplest static ctors) 
and eliminate all the static ctors using the following pattern:

The pattern: The trick is to convert every variable that needs to be 
initialized into an "init on first use" ref @property. This "ref @property" 
checks a "hasThisBeenInited" bool and, if false, runs all the 
initialization. If the variable is a reference type, then *sometimes* you 
can get away with just checking for null (though often not, because it will 
just get re-inited).

Example:
------------------------------
// Old:
class Foo { /+...+/ }
int number;
Foo foo;
static this()
{
    foo = new Foo();
    number = /+ something maybe involving foo +/;
}

// New:
class Foo { /+...+/ }

private int _number;
@property ref int number()
{
    forceModuleInit();
    return _number;
}

private Foo _foo;
@property ref Foo foo()
{
    forceModuleInit();
    return _foo;
}

bool isModuleInited=false;
static void forceModuleInit() // Hopefully inlined
{
    if(!isModuleInited)
    {
        // Set the flag first so that staticThis() can read the
        // properties without recursing back into itself
        isModuleInited = true;
        staticThis();
    }
}
static void staticThis() // Workaround for static ctor cycle
{
    // Assign the backing variables, not the properties
    _foo = new Foo();
    _number = /+ something maybe involving foo +/;
}
------------------------------

If one of the variables being inited in the static ctor is something you 
*know* will never be accessed before some other specific variable is 
accessed, then you can skip converting it to "@property ref" if you want.

It's a big mess, but the conversion can be done deterministically, and even 
mechanically (heck, a ctfe string mixin wrapper could probably be built to 
do it).
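
A rough sketch of what such a CTFE string mixin wrapper could look like. The helper name `lazyGlobal` and its three string parameters are invented for this illustration, and unlike the hand-written example it keeps one "inited" flag per variable rather than one per module:

```d
// Hypothetical helper: builds, at compile time, the source text for a
// private backing variable plus an init-on-first-use `ref @property`.
string lazyGlobal(string type, string name, string initExpr)
{
    return
        "private " ~ type ~ " _" ~ name ~ ";\n" ~
        "private bool _" ~ name ~ "Inited = false;\n" ~
        "@property ref " ~ type ~ " " ~ name ~ "()\n" ~
        "{\n" ~
        "    if (!_" ~ name ~ "Inited)\n" ~
        "    {\n" ~
        "        _" ~ name ~ "Inited = true; // set first to avoid recursion\n" ~
        "        _" ~ name ~ " = " ~ initExpr ~ ";\n" ~
        "    }\n" ~
        "    return _" ~ name ~ ";\n" ~
        "}\n";
}

class Foo { int value = 21; }

// Each line replaces one global plus its assignment in the old static ctor.
mixin(lazyGlobal("Foo", "foo", "new Foo()"));
mixin(lazyGlobal("int", "number", "foo.value * 2"));
```

Because `number` is initialized from `foo`, reading `number` first also triggers the initialization of `foo`, so the order is driven by use rather than by import order.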

The potential downsides:

1. If you come across an @property bug or limitation, you're SOL. This should 
become less and less of an issue with time, though.

2. If one of the variables you converted is frequently-accessed, it could 
cause a performance problem.

3. Small increase to storage requirements. Might potentially be a problem if 
it's within templated or mixed-in code that gets instantiated many times.

At one point, I fiddled around with the idea of converting static ctors to 
"staticThis()" and then having one real static ctor for the entire library 
(assuming it's a library) that manually calls all the staticThis functions. 
One problem with this is that it's easy to accidentally forget to call one 
of the staticThis functions. The other big problem I found with this though, 
especially for a library, is that it requires everyone importing your code 
to always import through a single "import foo.all" module. If some user 
skips that, then the static ctors won't get run. There might be some 
possible workarounds for that, though:

- If the library has some primary interface that always gets used, then that 
can easily check if the static ctors have run and error out if not. If the 
primary interface is *always* the first part of your library used (or at 
least the first of all the parts that actually rely on the static ctors 
having run), then you could even run the static ctors right then instead of 
erroring out. That's a lot of "if"s, though, so it may not be 
widely-applicable.

- If you convert *all* static ctors to staticThis(), it might be possible to 
stick the one main static ctor into a private utility module that gets 
privately imported by all modules in the library. Then users can continue 
importing whatever module(s) they want. But if you don't convert *all* of 
the static ctors to staticThis, then you'll just re-introduce a cycle.
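
The first workaround (the primary interface either erroring out or running the initialization itself) could look roughly like this. `table`, `lookupStrict` and `lookupLazy` are placeholder names, and `enforce` from std.exception is just one way to report the error:

```d
import std.exception : enforce;

private bool libInited = false; // thread-local, like the state set by `static this()`
private int[] table;            // stand-in for whatever the static ctor used to set up

// Takes the place of the library's `static this()`.
void staticThis()
{
    table = [1, 2, 3];
    libInited = true;
}

// Variant 1: the primary interface refuses to run uninitialized.
int lookupStrict(size_t i)
{
    enforce(libInited, "library not initialized: call staticThis() first");
    return table[i];
}

// Variant 2: the primary interface runs the initialization on demand.
int lookupLazy(size_t i)
{
    if (!libInited)
        staticThis();
    return table[i];
}
```

Variant 2 only helps if every entry point that depends on the initialized state performs the same check.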

But if there's ever two separate libraries that have any interdependencies, 
then the one-main-real static ctor (that calls all the staticThis() funcs) 
will have to be shared between the two libraries. So overall, this approach 
may be possible, but maybe only in certain cases, and can involve a lot of 
changes.

Mar 21 2011

"Nick Sabalausky" <a a.a> writes:

"Nick Sabalausky" <a a.a> wrote in message 
news:im8pmp$18p7$1 digitalmars.com...
 The pattern: The trick is to convert every variable that needs to be 
 initialized into an "init on first use" ref @property. This "ref 
 @property" checks a "hasThisBeenInited" bool and, if false, runs all the 
 initialization. If the variable is a reference type, then *sometimes* you 
 can get away with just checking for null (though often not, because it 
 will just get-reinited ).

...if anyone sets it to null.

(Forgot to finish that last sentence.)
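
A small sketch of that failure mode, assuming a null check is used in place of the bool flag. `initCount` and the `id` field exist only to make the re-initialization visible:

```d
class Foo
{
    int id;
    this(int id) { this.id = id; }
}

private Foo _foo;
private int initCount; // hypothetical: counts how often the lazy init runs

// Init on first use, keyed on null instead of a separate bool flag.
@property ref Foo foo()
{
    if (_foo is null)
    {
        ++initCount;
        _foo = new Foo(initCount);
    }
    return _foo;
}
```

After `foo = null;` the next read of `foo` quietly constructs a second object (with `id == 2` here), whereas the static ctor version would have stayed null.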

Mar 21 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Tue, 22 Mar 2011 02:12:55 +0200, Nick Sabalausky <a a.a> wrote:

 The question: What now? What strategies do people find useful for dealing
 with this? Any specific "first steps" to take? Best practices? Etc.

Your post doesn't seem to mention it, but how about converting the static  
ctors to initialization functions, and calling them from a single static  
ctor within the dependency loop?

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Mar 21 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Tue, 22 Mar 2011 04:30:37 +0200, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 Your post doesn't seem to mention it,

Sorry, didn't scroll down enough :)

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Mar 21 2011

"Nick Sabalausky" <a a.a> writes:

"Vladimir Panteleev" <vladimir thecybershadow.net> wrote in message 
news:op.vsp3zooituzx1w cybershadow.mshome.net...
 On Tue, 22 Mar 2011 04:30:37 +0200, Vladimir Panteleev 
 <vladimir thecybershadow.net> wrote:

 Your post doesn't seem to mention it,

 Sorry, didn't scroll down enough :)

Well, that is a lot of scrolling, actually :)

Mar 21 2011

spir <denis.spir gmail.com> writes:

On 03/22/2011 01:12 AM, Nick Sabalausky wrote:
 I'm intending this thread as somewhat of a roundtable-like discussion.
 Hopefully we can come up with good material for a short article on Wiki4D,
 or maybe the D website, or wherever.

 The scenario: A coder is writing some D, compiles, runs and gets a "Cyclic
 dependency in static ctors" error. Crap! A pain for experienced D users, and
 very bad PR for new D users. (Hopefully we'll eventually get some sort of
 solution for this, but in the meantime D users just have to deal with it.)

 The question: What now? What strategies do people find useful for dealing
 with this? Any specific "first steps" to take? Best practices? Etc.

 Aside from the old "start merging modules" PITA that many of us are familiar
 with from the old "100 forward reference errors at the drop of a hat" days,
 I've found one viable (but still kinda PITA) strategy so far:

 1. Look at the line that says: "module foo ->  module bar ->  module foo"

 2. Pick one of those two modules (whichever has the simplest static ctors)
 and eliminate all the static ctors using the following pattern:

 The pattern: The trick is to convert every variable that needs to be
 initialized into an "init on first use" ref @property. This "ref @property"
 checks a "hasThisBeenInited" bool and, if false, runs all the
 initialization. If the variable is a reference type, then *sometimes* you
 can get away with just checking for null (though often not, because it will
 just get-reinited ).

 Example:
 ------------------------------
 // Old:
 class Foo { /+...+/ }
 int number;
 Foo foo;
 static this
 {
      foo = new Foo();
      number = /+ something maybe involving foo +/;
 }

 // New:
 class Foo { /+...+/ }

 private int _number;
 @property ref int number()
 {
      forceModuleInit();
      return _number;
 }

 private Foo _foo;
 @property ref Foo foo()
 {
      forceModuleInit();
      return _foo;
 }

 bool isModuleInited=false;
 static void forceModuleInit() // Hopefully inlined
 {
      if(!isModuleInited)
      {
          staticThis();
          isModuleInited = true;
      }
 }
 static void staticThis() // Workaround for static ctor cycle
 {
      foo = new Foo();
      number = /+ something maybe involving foo +/;
 }
 ------------------------------

 If one of the variables being inited in the static ctor is something you
 *know* will never be accessed before some other specific variable is
 accessed, then you can skip converting it to "@property ref" if you want.

 It's a big mess, but the conversion can be done deterministically, and even
 mechanically (heck, a ctfe string mixin wrapper could probably be built to
 do it).

 The potential downsides:

 1. If you come across an @property bug or limitation, you're SOL. This should
 become less and less of an issue with time, though.

 2. If one of the variables you converted is frequently-accessed, it could
 cause a performance problem.

 3. Small increase to storage requirements. Might potentially be a problem if
 it's within templated or mixed-in code that gets instantiated many times.

 At one point, I fiddled around with the idea of converting static ctors to
 "staticThis()" and then having one real static ctor for the entire library
 (assuming it's a library) that manually calls all the staticThis functions.
 One problem with this is that it's easy to accidentally forget to call one
 of the staticThis functions. The other big problem I found with this though,
 especially for a library, is that it requires everyone importing your code
 to always import through a single "import foo.all" module. If some user
 skips that, then the static ctors won't get run. There might be some
 possible workarounds for that, though:

 - If the library has some primary interface that always gets used, then that
 can easily check if the static ctors have run and error out if not. If the
 primary interface is *always* the first part of your library used (or at
 least the first of all the parts that actually rely on the static ctors
 having run), then you could even run the static ctors right then instead of
 erroring out. That's a lot of "if"s, though, so it may not be
 widely-applicable.

 - If you convert *all* static ctors to staticThis(), it might be possible to
 stick the one main static ctor into a private utility module that gets
 privately imported by all modules in the library. Then users can continue
 importing whatever module(s) they want. But if you don't convert *all* of
 the static ctors to staticThis, then you'll just re-introduce a cycle.

 But if there's ever two separate libraries that have any interdependencies,
 then the one-main-real static ctor (that calls all the staticThis() funcs)
 will have to be shared between the two libraries. So overall, this approach
 may be possible, but maybe only in certain cases, and can involve a lot of
 changes.

I think the idea of a single static constructor in a main lib import module 
(let's call it lib.d), calling staticThis func in every module, is the right 
track. First, it is rather a good practice (I mean both from the designer and 
user points of view).
Upon "it's easy to accidentally forget to call one of the staticThis 
functions": just systematically write one in every module, possibly empty at 
start. Then, from the main module, call staticThis on every imported module. 
This can be a cascade: staticThis of M calls staticThis of its own private 
dependencies. Shared submodules should still be init from lib, but even double 
init would be cheap thanks to the bool flag.

Upon the case where users may need and import only part of your lib, then I 
guess obviously this means some kind of *independency*, doesn't it? Else, all 
must be init-ed, and thus imported, anyway, don't you think? In which case they 
could as well import the main module in every case, and use only what they need.

Upon the last issue of common dependencies. If the problem of cyclic 
dependencies in static ctors is analogous to circular imports, then there are 2 
common strategies (here considering 2 modules importing each other):
* Isolate a part of a module that requires its own module and another one, 
place it in a 3rd module which imports both. (This is a common issue for test 
cases.)
* Conversely, isolate a part of a module that /is/ required by its own module 
and another one, place it in a 3rd module imported by both. (This is a common 
issue for tool features.)
Strangely enough, this often solves the problem by splitting further instead of 
by merging. Though I don't know whether this is appropriate for your issue.

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 22 2011

Max Samukha <max spam.box> writes:

On 03/22/2011 02:12 AM, Nick Sabalausky wrote:
 I'm intending this thread as somewhat of a roundtable-like discussion.
 Hopefully we can come up with good material for a short article on Wiki4D,
 or maybe the D website, or wherever.

 The scenario: A coder is writing some D, compiles, runs and gets a "Cyclic
 dependency in static ctors" error. Crap! A pain for experienced D users, and
 very bad PR for new D users. (Hopefully we'll eventually get some sort of
 solution for this, but in the meantime D users just have to deal with it.)

 The question: What now? What strategies do people find useful for dealing
 with this? Any specific "first steps" to take? Best practices? Etc.

One commonly used hack is to move static constructors into a separate 
helper module and call the initialization function via a C extern (like 
it is done in std.stdiobase):

----
module foo_helper;

private extern(C) void foo_static_ctor();
static this()
{
     foo_static_ctor();
}

-----

module foo;
import foo_helper;

private Object global;
private extern(C) void foo_static_ctor()
{
     global = new Object;
}
----

Note that "global" is guaranteed to have been initialized when accessed 
from static constructors in modules that import "a". Being able to 
instruct the compiler to do this implicitly (so we could put static 
ctors in templates, for example) would probably solve most of static 
ctor problems.

 Aside from the old "start merging modules" PITA that many of us are familiar
 with from the old "100 forward reference errors at the drop of a hat" days,
 I've found one viable (but still kinda PITA) strategy so far:

 1. Look at the line that says: "module foo ->  module bar ->  module foo"

 2. Pick one of those two modules (whichever has the simplest static ctors)
 and eliminate all the static ctors using the following pattern:

 The pattern: The trick is to convert every variable that needs to be
 initialized into an "init on first use" ref @property. This "ref @property"
 checks a "hasThisBeenInited" bool and, if false, runs all the
 initialization. If the variable is a reference type, then *sometimes* you
 can get away with just checking for null (though often not, because it will
 just get-reinited ).

 Example:
 ------------------------------
 // Old:
 class Foo { /+...+/ }
 int number;
 Foo foo;
 static this
 {
      foo = new Foo();
      number = /+ something maybe involving foo +/;
 }

 // New:
 class Foo { /+...+/ }

 private int _number;
 @property ref int number()
 {
      forceModuleInit();
      return _number;
 }

 private Foo _foo;
 @property ref Foo foo()
 {
      forceModuleInit();
      return _foo;
 }

 bool isModuleInited=false;
 static void forceModuleInit() // Hopefully inlined
 {
      if(!isModuleInited)
      {
          staticThis();
          isModuleInited = true;
      }
 }
 static void staticThis() // Workaround for static ctor cycle
 {
      foo = new Foo();
      number = /+ something maybe involving foo +/;
 }
 ------------------------------

 If one of the variables being inited in the static ctor is something you
 *know* will never be accessed before some other specific variable is
 accessed, then you can skip converting it to "@property ref" if you want.

 It's a big mess, but the conversion can be done deterministically, and even
 mechanically (heck, a ctfe string mixin wrapper could probably be built to
 do it).

 The potential downsides:

 1. If you come across an @property bug or limitation, you're SOL. This should
 become less and less of an issue with time, though.

 2. If one of the variables you converted is frequently-accessed, it could
 cause a performance problem.

 3. Small increase to storage requirements. Might potentially be a problem if
 it's within templated or mixed-in code that gets instantiated many times.

4. Initializing shared data needs synchronization. Then, your example 
would look similar to this:

------------------------------

class Foo { /+...+/ }

// __gshared rather than immutable, so they can be assigned outside a
// "shared static this"; callers only get a copy or an immutable view.
private __gshared int _number;
@property int number()
{
      forceModuleInit();
      return _number;
}

private __gshared Foo _foo;
@property immutable(Foo) foo()
{
      forceModuleInit();
      return cast(immutable)_foo;
}

bool isModuleInited=false; // thread-local fast path
__gshared bool isSharedModuleInited=false;
static void forceModuleInit() // Hopefully inlined
{
      if(!isModuleInited)
      {
          synchronized // bare form: one global lock for this statement
          {
              if (!isSharedModuleInited)
              {
                  sharedStaticThis();
                  isSharedModuleInited = true;
              }
          }
          isModuleInited = true;
      }
}
static void sharedStaticThis() // Workaround for shared static ctor cycle
{
      auto foo = new Foo();
      _number = /+ something maybe involving foo +/;
      _foo = foo;
}
------------------------------

Mar 22 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

 On 03/22/2011 02:12 AM, Nick Sabalausky wrote:
 I'm intending this thread as somewhat of a roundtable-like discussion.
 Hopefully we can come up with good material for a short article on
 Wiki4D, or maybe the D website, or wherever.
 
 The scenario: A coder is writing some D, compiles, runs and gets a
 "Cyclic dependency in static ctors" error. Crap! A pain for experienced
 D users, and very bad PR for new D users. (Hopefully we'll eventually
 get some sort of solution for this, but in the meantime D users just
 have to deal with it.)
 
 The question: What now? What strategies do people find useful for dealing
 with this? Any specific "first steps" to take? Best practices? Etc.

 
 One commonly used hack is to move static constructors into a separate
 helper module and call the initialization function via a C extern (like
 it is done in std.stdiobase):

That's what Phobos does to solve the problem (std.stdiobase being only one of 
the places that it does it). It's likely the solution that I would use as 
well.

- Jonathan M Davis

Mar 22 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-03-22 05:16:31 -0400, Max Samukha <max spam.box> said:

 ----
 module foo_helper;
 
 private extern(C) foo_static_ctor();
 static this()
 {
      foo_static_ctor();
 }
 
 -----
 
 module foo;
 import foo_helper;
 
 private Object global;
 private extern(C) void foo_static_ctor()
 {
      global = new Object;
 }
 ----
 
 Note that "global" is guaranteed to have been initialized when accessed 
 from static constructors in modules that import "a".

I don't know why people keep repeating that fallacy. This statement is 
true only as long as there are no circular dependencies. It should 
read: "'global' is guaranteed to have been initialized when accessed from 
static constructors in modules that import 'a' _and_ which are not 
imported by 'a', directly or indirectly."

Because once you introduce a circular dependency, you get this:

----
module foo_helper;

private extern(C) void foo_static_ctor();
static this() { foo_static_ctor(); }
-----
module foo;
import foo_helper;
import bar;

private Object global;
private extern(C) void foo_static_ctor()
{
     global = new Object;
     bar.testGlobal();
}
public void testGlobal()
{
     assert(global);
}
----
module bar_helper;

private extern(C) void bar_static_ctor();
static this() { bar_static_ctor(); }
-----
module bar;
import bar_helper;
import foo;

private Object global;
private extern(C) void bar_static_ctor()
{
     global = new Object;
     foo.testGlobal();
}
public void testGlobal()
{
     assert(global);
}
----

Note how foo_static_ctor() and bar_static_ctor() each calls a function 
that needs the global variable of the other module to be initialized. 
It should be obvious that it can't work. If you doubt me, try it.


 Being able to instruct the compiler to do this implicitly (so we could 
 put static ctors in templates, for example) would probably solve most 
 of static ctor problems.

It'd have the exact same effect as adding a pragma to bypass the check 
for circular dependencies, while making things more complicated.


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Mar 22 2011

Max Samukha <max spam.box> writes:

On 03/22/2011 04:14 PM, Michel Fortin wrote:
 On 2011-03-22 05:16:31 -0400, Max Samukha <max spam.box> said:

 ----
 module foo_helper;

 private extern(C) foo_static_ctor();
 static this()
 {
 foo_static_ctor();
 }

 -----

 module foo;
 import foo_helper;

 private Object global;
 private extern(C) void foo_static_ctor()
 {
 global = new Object;
 }
 ----

 Note that "global" is guaranteed to have been initialized when
 accessed from static constructors in modules that import "a".

 I don't know why people keep repeating that falacy.

It's not people, only me. It's embarrassing. I deserve to be processed 
in a bioreactor on a charge of utter incompetency.

Of course, you are absolutely right.

Then why do we keep losing our last bits of sanity inventing workarounds? The 
problem obviously has no safe solution in the context of D. People have 
been asking for a solution for a long time. There is an obvious need for 
it. So why not just give us that damn pragma?

Mar 22 2011

Jacob Carlborg <doob me.com> writes:

On 2011-03-22 01:12, Nick Sabalausky wrote:
 At one point, I fiddled around with the idea of converting static ctors to
 "staticThis()" and then having one real static ctor for the entire library
 (assuming it's a library) that manually calls all the staticThis functions.
 One problem with this is that it's easy to accidentally forget to call one
 of the staticThis functions. The other big problem I found with this though,
 especially for a library, is that it requires everyone importing your code
 to always import through a single "import foo.all" module. If some user
 skips that, then the static ctors won't get run. There might be some
 possible workarounds for that, though:

One idea could be, although very platform dependent, to iterate the 
symbol table, finding all "staticThis" functions and calling them. Then 
you don't have the problem of forgetting to call one of those functions.

-- 
/Jacob Carlborg

Mar 22 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 21 Mar 2011 20:12:55 -0400, Nick Sabalausky <a a.a> wrote:

 I'm intending this thread as somewhat of a roundtable-like discussion.
 Hopefully we can come up with good material for a short article on  
 Wiki4D,
 or maybe the D website, or wherever.

 The scenario: A coder is writing some D, compiles, runs and gets a  
 "Cyclic
 dependency in static ctors" error. Crap! A pain for experienced D users,  
 and
 very bad PR for new D users. (Hopefully we'll eventually get some sort of
 solution for this, but in the meantime D users just have to deal with  
 it.)

 The question: What now? What strategies do people find useful for dealing
 with this? Any specific "first steps" to take? Best practices? Etc.

What one can try is to factor out the initialization code into a separate  
module.  Essentially if you have:

module a;
import f : someFunction;
import b; // conflicts because of circular dependencies

int aglobal;

static this()
{
   aglobal = someFunction();
}

You can do something like:

module a_static;
import f : someFunction;

int aglobal;

static this()
{
    aglobal = someFunction();
}

in a.d:
module a;
public import a_static;
import b;

Of course, there can be repercussions -- you may need to have aglobal  
declared in a.d.  In those cases, one can try to hide the cycle as Max  
Samukha stated, but I'd rather see a compiler option than these kinds of  
workarounds.  The workarounds can be unobvious, but can be just as  
dangerous.

-Steve

Mar 22 2011

Graham St Jack <Graham.StJack internode.on.net> writes:

On 23/03/11 03:41, Steven Schveighoffer wrote:
 On Mon, 21 Mar 2011 20:12:55 -0400, Nick Sabalausky <a a.a> wrote:

 I'm intending this thread as somewhat of a roundtable-like discussion.
 Hopefully we can come up with good material for a short article on 
 Wiki4D,
 or maybe the D website, or wherever.

 The scenario: A coder is writing some D, compiles, runs and gets a 
 "Cyclic
 dependency in static ctors" error. Crap! A pain for experienced D 
 users, and
 very bad PR for new D users. (Hopefully we'll eventually get some 
 sort of
 solution for this, but in the meantime D users just have to deal with 
 it.)

 The question: What now? What strategies do people find useful for 
 dealing
 with this? Any specific "first steps" to take? Best practices? Etc.

 What one can try is to factor out the initialization code into a 
 separate module.  Essentially if you have:

 module a;
 import f : someFunction;
 import b; // conflicts because of circular dependencies

 int aglobal;

 static this()
 {
   aglobal = someFunction();
 }

 You can do something like:

 module a_static;
 import f : someFunction;

 int aglobal;

 static this()
 {
    aglobal = someFunction();
 }

 in a.d:
 module a;
 public import a_static;
 import b;

 Of course, there can be reprecussions -- you may need to have aglobal 
 declared in a.d.  In those cases, one can try to hide the cycle as Max 
 Samuckha stated, but I'd rather see a compiler option than these kinds 
 of workarounds.  The workarounds can be unobvious, but can be just as 
 dangerous.

 -Steve

My own solution to this "problem" is to never have circular imports at 
all. The build system I use prohibits them, so any careless introduction 
of a circularity is spotted immediately and I refactor the code to 
eliminate the circularity. I have never come across a valid need for 
circularities, and have never had any trouble eliminating any that creep in.

Avoiding circularities has plenty of advantages, like progressive 
development, testing and integration. On bigger projects these 
advantages are very important, and even on small ones they are useful.

-- 
Graham St Jack

Mar 22 2011

"Nick Sabalausky" <a a.a> writes:

"Graham St Jack" <Graham.StJack internode.on.net> wrote in message 
news:imbai9$2jb9$1 digitalmars.com...
 My own solution to this "problem" is to never have circular imports at 
 all. The build system I use prohibits them, so any careless introduction 
 of a circularity is spotted immediately and I refactor the code to 
 eliminate the circularity. I have never come across a valid need for 
 circularities, and have never had any trouble eliminating any that creep 
 in.

 Avoiding circularities has plenty of advantages, like progressive 
 development, testing and integration. On bigger projects these advantages 
 are very important, and even on small ones they are useful.

That's certainly good in many cases, but I find there are many times when a 
"one-way" dependency graph just doesn't fit the given problem and causes 
more trouble than it solves. You often end up needing to re-invent the wheel 
to avoid a dependency, or split/arrange/merge modules in confusing 
unintuitive ways that have more to do with implementation detail than 
high-level purpose.

Mar 22 2011

Graham St Jack <Graham.StJack internode.on.net> writes:

On 23/03/11 15:12, Nick Sabalausky wrote:
 "Graham St Jack"<Graham.StJack internode.on.net>  wrote in message
 news:imbai9$2jb9$1 digitalmars.com...
 My own solution to this "problem" is to never have circular imports at
 all. The build system I use prohibits them, so any careless introduction
 of a circularity is spotted immediately and I refactor the code to
 eliminate the circularity. I have never come across a valid need for
 circularities, and have never had any trouble eliminating any that creep
 in.

 Avoiding circularities has plenty of advantages, like progressive
 development, testing and integration. On bigger projects these advantages
 are very important, and even on small ones they are useful.

 That's certainly good in many cases, but I find there are many times when a
 "one-way" dependency graph just doesn't fit the given problem and causes
 more trouble than it solves. You often end up needing to re-invent the wheel
 to avoid a dependency, or split/arrange/merge modules in confusing
 unintuitive ways that have more to do with implementation detail than
 high-level purpose.

I'm happy to admit that these cases could come up, but I have never yet 
seen one where the design wasn't improved by removing the circularity.


-- 
Graham St Jack

Mar 22 2011

Don <nospam nospam.com> writes:

Graham St Jack wrote:
 On 23/03/11 15:12, Nick Sabalausky wrote:
 "Graham St Jack"<Graham.StJack internode.on.net>  wrote in message
 news:imbai9$2jb9$1 digitalmars.com...
 My own solution to this "problem" is to never have circular imports at
 all. The build system I use prohibits them, so any careless introduction
 of a circularity is spotted immediately and I refactor the code to
 eliminate the circularity. I have never come across a valid need for
 circularities, and have never had any trouble eliminating any that creep
 in.

 Avoiding circularities has plenty of advantages, like progressive
 development, testing and integration. On bigger projects these 
 advantages
 are very important, and even on small ones they are useful.

 That's certainly good in many cases, but I find there are many times 
 when a
 "one-way" dependency graph just doesn't fit the given problem and causes
 more trouble than it solves. You often end up needing to re-invent the 
 wheel
 to avoid a dependency, or split/arrange/merge modules in confusing
 unintuitive ways that have more to do with implementation detail than
 high-level purpose.

 I'm happy to admit that these cases could come up, but I have never yet 
 seen one where the design wasn't improved by removing the circularity.
 
 

I wish Phobos didn't have any circular dependencies. Unfortunately, 
there are almost no modules which aren't in a loop (Basically, anything 
which imports std.range is a lost cause). There is no doubt that it 
hurts debugging.

Mar 23 2011

spir <denis.spir gmail.com> writes:

On 03/23/2011 12:12 AM, Graham St Jack wrote:
 Avoiding circularities has plenty of advantages, like progressive development,
 testing and integration.

Maybe it depends on your app domain or whatnot; there are lots of cases, I 
guess, where circularities are inevitable, if not direct expression of the
problem.
Take for instance a set of factories (eg parsing pattern) defined in a M1 
producing results (eg parse tree nodes) defined in M2. It's clear that M1 
imports M2. Then, how do you unittest M2? You should import M1... Sure, there 
are various workarounds (creating fake pattern types or objects, exporting the 
tests in a 3rd module...), but they are only this: workarounds.

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 23 2011

"Nick Sabalausky" <a a.a> writes:

"spir" <denis.spir gmail.com> wrote in message 
news:mailman.2690.1300879902.4748.digitalmars-d puremagic.com...
 On 03/23/2011 12:12 AM, Graham St Jack wrote:
 Avoiding circularities has plenty of advantages, like progressive 
 development,
 testing and integration.

 Maybe it depends on your app domain or whatnot; there are lots of cases, I 
 guess, where circularities are inevitable, if not direct expression of the 
 problem.
 Take for instance a set of factories (eg parsing pattern) defined in a M1 
 producing results (eg parse tree nodes) defined in M2. It's clear that M1 
 imports M2. Then, how do you unittest M2? You should import M1... Sure, 
 there are various workarounds (creating fake pattern types or objects, 
 exporting the tests in a 3rd module...), but they are only this: 
 workarounds.

Funny, I had a couple parsing examples in mind, too: If you have a 
general-purpose (ie, grammar-agnostic) parsing tool, then it may make sense 
for the parse tree nodes (in one module) to know what Language (from 
another module) they're part of. If it's a grammar-agnostic parsing tool, 
this information can't be encoded in the type. Or, if you have a variety of 
parsing-error-related types (exceptions, for instance), then if they need to 
know what Language they're from, you can't put them in a separate module 
without creating a cycle.

Another thing is string-processing vs general-purpose string-mixin 
utilities: If you have a bunch of CTFE-compatible string-processing 
functions, and a bunch of general-purpose string-mixin-based utilities, it 
makes sense to have them in separate modules. The general-purpose 
string-mixin utilities are almost certainly going to depend on the 
string-processing functions. But if the string-mixin utilities are indeed 
general-purpose, it's likely that some of them may be very useful to the 
string-processing module.

So avoiding cycles can involve some real contortions in certain cases. But I 
do agree they're certainly good to avoid whenever it's reasonable and 
practical to do so.

Mar 23 2011

Graham St Jack <Graham.StJack internode.on.net> writes:

Regarding unit tests - I have never been a fan of putting unit test code 
into the modules being tested because:
* Doing so introduces stacks of unnecessary imports, and bloats the module.
* Executing the unittests happens during execution rather than during 
the build.

All unittests (as in the keyword) seem to have going for them is to be 
an aid to documentation.

What I do instead is put unit tests into separate modules, and use a 
custom build system that compiles, links AND executes the unit test 
modules (when out of date of course). The build fails if a test does not 
pass.

The separation of the test from the code under test has plenty of 
advantages and no down-side that I can see - assuming you use a build 
system that understands the idea. Some of the advantages are:
* No code-bloat or unnecessary imports.
* Much easier to manage inter-module dependencies.
* The tests can be fairly elaborate, and can serve as well-documented 
examples of how to use the code under test.
* Since they only execute during the build, and even then only when out 
of date, they can afford to be more complete tests (ie use plenty of cpu 
time)
* If the code builds, you know all the unit tests pass. No need for a 
special unittest build and manual running of assorted programs to see if 
the tests pass.
* No need for special builds with -unittest turned on.


-- 
Graham St Jack

Mar 23 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

 Regarding unit tests - I have never been a fan of putting unit test code
 into the modules being tested because:
 * Doing so introduces stacks of unnecessary imports, and bloats the module.
 * Executing the unittests happens during execution rather than during
 the build.
 
 All unittests (as in the keyword) seem to have going for them is to be
 an aid to documentation.
 
 What I do instead is put unit tests into separate modules, and use a
 custom build system that compiles, links AND executes the unit test
 modules (when out of date of course). The build fails if a test does not
 pass.
 
 The separation of the test from the code under test has plenty of
 advantages and no down-side that I can see - assuming you use a build
 system that understands the idea. Some of the advantages are:
 * No code-bloat or unnecessary imports.
 * Much easier to manage inter-module dependencies.
 * The tests can be fairly elaborate, and can serve as well-documented
 examples of how to use the code under test.
 * Since they only execute during the build, and even then only when out
 of date, they can afford to be more complete tests (ie use plenty of cpu
 time)
 * If the code builds, you know all the unit tests pass. No need for a
 special unittest build and manual running of assorted programs to see if
 the tests pass.
 * No need for special builds with -unittest turned on.

Obviously, it wouldn't resolve all of your concerns, but I would point out 
that you can use version(unittest) to enclose stuff that's only supposed to be 
in the unit tests build. And that includes using version(unittest) on imports, 
avoiding having to import stuff which is only needed for unit tests during 
normal builds.
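
To make the version(unittest) point concrete, here is a minimal sketch. The function `doubled` is a made-up example, not something from Phobos; the only thing being illustrated is that the `std.algorithm` import is seen solely in a -unittest build.

```d
// Compiled in only when the module is built with -unittest;
// a normal build never sees this import.
version (unittest) import std.algorithm : equal;

/// Hypothetical function under test: doubles every element.
int[] doubled(const int[] xs)
{
    auto result = new int[xs.length];
    foreach (i, x; xs)
        result[i] = x * 2;
    return result;
}

unittest
{
    // `equal` is available here because of the version(unittest) import.
    assert(equal(doubled([1, 2, 3]), [2, 4, 6]));
}
```

Anything else that only the tests need (helper types, test data) can go inside a `version (unittest) { ... }` block in the same way.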

- Jonathan M Davis

Mar 23 2011

Graham St Jack <Graham.StJack internode.on.net> writes:

On 24/03/11 15:19, Jonathan M Davis wrote:
 Regarding unit tests - I have never been a fan of putting unit test code
 into the modules being tested because:
 * Doing so introduces stacks of unnecessary imports, and bloats the module.
 * Executing the unittests happens during execution rather than during
 the build.

 All unittests (as in the keyword) seem to have going for them is to be
 an aid to documentation.

 What I do instead is put unit tests into separate modules, and use a
 custom build system that compiles, links AND executes the unit test
 modules (when out of date of course). The build fails if a test does not
 pass.

 The separation of the test from the code under test has plenty of
 advantages and no down-side that I can see - assuming you use a build
 system that understands the idea. Some of the advantages are:
 * No code-bloat or unnecessary imports.
 * Much easier to manage inter-module dependencies.
 * The tests can be fairly elaborate, and can serve as well-documented
 examples of how to use the code under test.
 * Since they only execute during the build, and even then only when out
 of date, they can afford to be more complete tests (ie use plenty of cpu
 time)
 * If the code builds, you know all the unit tests pass. No need for a
 special unittest build and manual running of assorted programs to see if
 the tests pass.
 * No need for special builds with -unittest turned on.

 Obviously, it wouldn't resolve all of your concerns, but I would point out
 that you can use version(unittest) to enclose stuff that's only supposed to be
 in the unit tests build. And that includes using version(unittest) on imports,
 avoiding having to import stuff which is only needed for unit tests during
 normal builds.

 - Jonathan M Davis

That is a good point, but as you say, it doesn't address all the concerns.

I would be interested to hear some success stories for the 
unittest-keyword approach. So far I can't see any up-side.

-- 
Graham St Jack

Mar 23 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

 On 24/03/11 15:19, Jonathan M Davis wrote:
 Regarding unit tests - I have never been a fan of putting unit test code
 into the modules being tested because:
 * Doing so introduces stacks of unnecessary imports, and bloats the
 module.
 * Executing the unittests happens during execution rather than
 during the build.
 
 All unittests (as in the keyword) seem to have going for them is to be
 an aid to documentation.
 
 What I do instead is put unit tests into separate modules, and use a
 custom build system that compiles, links AND executes the unit test
 modules (when out of date of course). The build fails if a test does not
 pass.
 
 The separation of the test from the code under test has plenty of
 advantages and no down-side that I can see - assuming you use a build
 system that understands the idea. Some of the advantages are:
 * No code-bloat or unnecessary imports.
 * Much easier to manage inter-module dependencies.
 * The tests can be fairly elaborate, and can serve as well-documented
 examples of how to use the code under test.
 * Since they only execute during the build, and even then only when out
 of date, they can afford to be more complete tests (ie use plenty of cpu
 time)
 * If the code builds, you know all the unit tests pass. No need for a
 special unittest build and manual running of assorted programs to see if
 the tests pass.
 * No need for special builds with -unittest turned on.

 
 Obviously, it wouldn't resolve all of your concerns, but I would point
 out that you can use version(unittest) to enclose stuff that's only
 supposed to be in the unit tests build. And that includes using
 version(unittest) on imports, avoiding having to import stuff which is
 only needed for unit tests during normal builds.
 
 - Jonathan M Davis

 
 That is a good point, but as you say, it doesn't address all the concerns.
 
 I would be interested to hear some success stories for the
 unittest-keyword approach. So far I can't see any up-side.

Personally, I find the unit tests to be _way_ more maintainable when they're 
right next to the code. I _really_ like that aspect of how unit tests are done 
in D. However, it does mean that you have to dig through more code to get at 
the actual function definitions (especially if you're thorough with your unit 
tests), and it _does_ increase problems with cyclical dependencies if you need 
static constructors for your unit tests and don't normally have a static 
constructor in the module in question (though you can probably just use an 
extra unittest block at the top of the module to do what the static 
constructor would have done for the unit tests).
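
A rough sketch of that "setup unittest block" idea follows. It assumes the unittest blocks of a module run in the order they appear in the source, which is how dmd behaves in practice; `testTable` and `lookup` are hypothetical names used only for illustration.

```d
// Hypothetical test-only data that would otherwise be filled in by
// a static constructor.
version (unittest) private int[string] testTable;

// First unittest block in the module: does the setup a static
// constructor would have done. It is not a static ctor, so it adds
// nothing to the module's static-constructor dependency checks.
unittest
{
    testTable = ["one": 1, "two": 2];
}

/// Hypothetical function under test: AA lookup with a fallback value.
int lookup(const int[string] table, string key, int fallback)
{
    if (auto p = key in table)
        return *p;
    return fallback;
}

unittest
{
    // Relies on the setup block above having run first (assumed ordering).
    assert(lookup(testTable, "two", 0) == 2);
    assert(lookup(testTable, "three", -1) == -1);
}
```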

I have no problem with having to do a special build for the unit tests. That's 
what I've generally had to do with other unit test frameworks anyway. Also, 
I'd hate for the tests to run as part of the build. I can understand why you 
might want that, but it would really hurt flexibility when debugging unit 
tests. How could you run gdb (or any other debugger) on the unit tests if it 
never actually builds? It's _easy_ to use gdb on unit tests with how unit 
tests currently work in D.

Really, I see only three downsides to how unit tests currently work in D, and 
two of those should be quite fixable.

1. Unit tests don't have names, so it's a royal pain to figure out which test 
an exception escaped from when an exception escapes a unit test. Adding an 
optional syntax like unittest(testname) would solve that problem. It's been 
proposed before, and it's a completely backwards compatible change (as long as 
the names aren't required).

2. Once a unit test fails in a module, none of the remaining unittest blocks 
run. Every unittest in a module should run regardless of whether the previous 
ones succeeded. A particular unittest block should not continue after it has 
had a failure, but the succeeding unittest blocks should be run. This has been 
discussed before and it is intended that it will be how unit tests work 
eventually, but as I understand it, there are changes which must be made to 
the compiler before it can happen.

3. Having the unit tests in the module does make it harder to find the code in 
the module. Personally, I think that the increased ease of maintenance of 
having the unit tests right next to the functions that they go with outweighs 
this problem, but there are plenty of people that complain about the unit 
tests making it harder to sort through the actual code. With a decent editor 
though, it's easy to hop around in the code, skipping unit tests, and you can 
shrink the unit test blocks so that they aren't shown. So, while this is 
certainly a valid concern, I don't think that it's ultimately enough of an 
issue to merit changing how unit tests work in D.

I certainly won't claim that unit tests in D are perfect, but IMHO they are 
far superior to having to deal with an external framework such as JUnit or 
CPPUnit. They also work really well for code maintenance and are so easy to 
use, that not using them is practically a crime.

- Jonathan M Davis

Mar 23 2011

spir <denis.spir gmail.com> writes:

On 03/24/2011 07:44 AM, Jonathan M Davis wrote:
 Personally, I find the unit tests to be _way_ more maintainable when they're
 right next to the code. I _really_ like that aspect of how unit tests are done
 in D. However, it does mean that you have to dig through more code to get at
 the actual function definitions (especially if you're thorough with your unit
 tests),

My position is intermediate between those of Jonathan and Graham: I want 
unittest to be in the tested module, but cleanly put apart. Typically, one 
module looks like:
1. general tool (imports, tool funcs... systematically present in all modules)
2. specific tools (import, tool funcs & types)
3. proper code of the module
4. tests

Advantages:
* clarity
* what Jonathan evokes above: unittests don't clutter code
* tools specific to testing are grouped there
* and the following:

The test section can look like:
* tools (imports, tool funcs, tool types, (possibly random) test data factory)
* a series of "void test*()" funcs
* finally:
unittest {
     testX();
     testY();
     testZ();
     ...
}
void main() {}

 From there, I can control precisely what test(s) will run, according to the 
piece of code I'm currently working on, by simply (un)commenting (out) lines in 
the single unittest section.

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 24 2011

"Nick Sabalausky" <a a.a> writes:

"Graham St Jack" <Graham.StJack internode.on.net> wrote in message 
news:imem32$o4d$1 digitalmars.com...
 I would be interested to hear some success stories for the 
 unittest-keyword approach. So far I can't see any up-side.

If it weren't for the unittests working the way they do, I probably never 
would have gotten around to using them. And after I started using them, I 
ended up taking those extra steps to make the unittests run in a separate 
program, and to make utilities to work around what I saw as the limitations: 
http://www.dsource.org/projects/semitwist/browser/trunk/src/semitwist/util/unittests.d 
(In particular, unittestSection, and the "autoThrow" modification to 
Jonathan's assertPred. The deferAssert/deferEnsure are probably going away, 
superseded by Jonathan's assertPred.)

So D's unittests working the way they do got me to actually use them. By 
contrast, my Haxe code has very little unittesting.

Mar 24 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 24 Mar 2011 00:17:03 -0400, Graham St Jack  
<Graham.StJack internode.on.net> wrote:

 Regarding unit tests - I have never been a fan of putting unit test code  
 into the modules being tested because:
 * Doing so introduces stacks of unnecessary imports, and bloats the  
 module.

As Jonathan says, version(unittest) works.  No need to bloat unnecessarily.

 * Executing the unittests happens during execution rather than during  
 the build.

Compile-time code execution is not a good idea for unit tests.  It is  
always more secure and accurate to execute tests in the environment of the  
application, not the compiler.

Besides, this is an implementation detail.  It is easily mitigated.  For  
example, phobos' unit tests can be run simply by doing:

make -f posix.mak unittest

and it builds + runs all unit tests.  This can be viewed as part of the  
"Build process".

 All unittests (as in the keyword) seem to have going for them is to be  
 an aid to documentation.

The huge benefit of D's unit tests is that the test is physically close  
to the code it's testing.  This helps in debugging and managing updates.   
When you update the code to fix a bug, the unit test is right there to  
modify as well.

The whole point of unittests is, if they are not easy to do and  
conveniently located, people won't do them.  You may have a really good  
system and good coding practices that allows you to implement tests the  
way you do.  But I typically will forget to update tests when I'm updating  
code.  It's much simpler if I can just add a new line right where I'm  
fixing the code.

 What I do instead is put unit tests into separate modules, and use a  
 custom build system that compiles, links AND executes the unit test  
 modules (when out of date of course). The build fails if a test does not  
 pass.

 The separation of the test from the code under test has plenty of  
 advantages and no down-side that I can see - assuming you use a build  
 system that understands the idea. Some of the advantages are:
 * No code-bloat or unnecessary imports.

Not a real problem with version(unittest).

 * Much easier to manage inter-module dependencies.

Not sure what you mean here.

 * The tests can be fairly elaborate, and can serve as well-documented  
 examples of how to use the code under test.

This is not a point against unit tests; they can be this way as well.  Unit  
testing phobos takes probably a minute on my system, including building  
the files.  They are as complex as they need to be.

 * Since they only execute during the build, and even then only when out  
 of date, they can afford to be more complete tests (ie use plenty of cpu  
 time)

IMO unit tests should not be run along with the full application.  I'd  
suggest a simple unit test blank main function.  I think even dmd (or  
rdmd?) will do this for you.

There is no requirement to also run your application when running unit  
tests.
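
A minimal sketch of the "blank main" arrangement, assuming a single-file program; `square` is a placeholder function. In a -unittest build only the empty main is compiled, so the tests run and the program exits without doing its normal work.

```d
import std.stdio : writeln;

/// Placeholder function under test.
int square(int x)
{
    return x * x;
}

unittest
{
    assert(square(3) == 9);
}

version (unittest)
{
    // Test build: the unittest blocks run before main, and main itself
    // does nothing, so the application logic never executes.
    void main() {}
}
else
{
    // Normal build: the real entry point.
    void main()
    {
        writeln(square(7));
    }
}
```

For a library with no main of its own, rdmd's `--main` switch supplies a stub main, so something along the lines of `rdmd --main -unittest file.d` should be enough to run that file's tests.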

 * If the code builds, you know all the unit tests pass. No need for a  
 special unittest build and manual running of assorted programs to see if  
 the tests pass.

This all stems from your assumption that you have to run unittests along  
with your main application.

When I use D unit tests, my command line is:

<command to build library/app> unittests

e.g.

make unittests

No special build situations are required.  You can put this into your  
normal build script if you wish (i.e. build 2 targets, one unit-tested  
version and one release version).

i.e.:

all: unittests app

 * No need for special builds with -unittest turned on.

Instead you need a special build of other external files?  I don't see any  
advantage here -- on one hand, you are building special extra files, on  
the other hand you are building the same files you normally build (which  
you should already have a list of) with the -unittest flag.  I actually  
find the latter simpler.

-Steve

Mar 24 2011

Graham St Jack <Graham.StJack internode.on.net> writes:

On 25/03/11 06:09, Steven Schveighoffer wrote:
 On Thu, 24 Mar 2011 00:17:03 -0400, Graham St Jack 
 <Graham.StJack internode.on.net> wrote:

 Regarding unit tests - I have never been a fan of putting unit test 
 code into the modules being tested because:
 * Doing so introduces stacks of unnecessary imports, and bloats the 
 module.

 As Jonathan says, version(unittest) works.  No need to bloat 
 unnecessarily.

Agreed. However, all the circularity problems pop up when you compile 
with -unittest.

 * Executing the unittests happens during execution rather than during 
 the build.

 Compile-time code execution is not a good idea for unit tests.  It is 
 always more secure and accurate to execute tests in the environment of 
 the application, not the compiler.

I didn't say during compilation - the build tool I use executes the test 
programs automatically.

 Besides, this is an implementation detail.  It is easily mitigated.  
 For example, phobos' unit tests can be run simply by doing:

 make -f posix.mak unittest

 and it builds + runs all unit tests.  This can be viewed as part of 
 the "Build process".

The problem I have with this is that executing the tests requires a 
"special" build and run which is optional. It is the optional part that 
is the key problem. In my last workplace, I set up a big test suite that 
was optional, and by the time we got around to running it, so many tests 
were broken that it was way too difficult to maintain. In my current 
workplace, the tests are executed as part of the build process, so you 
discover regressions ASAP.

 All unittests (as in the keyword) seem to have going for them is to 
 be an aid to documentation.

 The huge benefit of D's unit tests is that the test is physically 
 close to the code it's testing.  This helps in debugging and managing 
 updates.  When you update the code to fix a bug, the unit test is 
 right there to modify as well.

I guess that was what I was alluding to as well. I certainly agree that 
having the tests that close is handy for users of a module. The extra 
point you make is that the unittest approach is also easier for the 
maintainer, which is fair enough.

 The whole point of unittests is, if they are not easy to do and 
 conveniently located, people won't do them.  You may have a really 
 good system and good coding practices that allows you to implement 
 tests the way you do.  But I typically will forget to update tests 
 when I'm updating code.  It's much simpler if I can just add a new 
 line right where I'm fixing the code.

In practice I find that unit tests are often big and complex, and they 
deserve to be separate programs in their own right. The main exception 
to this is low-level libraries (like phobos?).

 What I do instead is put unit tests into separate modules, and use a 
 custom build system that compiles, links AND executes the unit test 
 modules (when out of date of course). The build fails if a test does 
 not pass.

 The separation of the test from the code under test has plenty of 
 advantages and no down-side that I can see - assuming you use a build 
 system that understands the idea. Some of the advantages are:
 * No code-bloat or unnecessary imports.

 Not a real problem with version(unittest).

 * Much easier to manage inter-module dependencies.

 Not sure what you mean here.

I mean that the tests typically have to import way more modules than the 
code under test, and separating them is a key step in eliminating 
circular imports.

 * The tests can be fairly elaborate, and can serve as well-documented 
 examples of how to use the code under test.

 This is not a point against unit tests; they can be this way as well.  
 Unit testing phobos takes probably a minute on my system, including 
 building the files.  They are as complex as they need to be.

Conceded - it doesn't matter where the tests are, they can be as big as 
they need to be.

As for the time tests take, an important advantage of my approach is 
that the test programs only execute if their test-passed file is out of 
date. This means that in a typical build, very few (often 0 or 1) tests 
have to be run, and doing so usually adds way less than a second to the 
build time. After every single build (even in release mode), you know 
for sure that all the tests pass, and it doesn't cost you any time or 
effort.
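
For readers who want to picture the "test-passed file" idea, here is a small sketch in D. It is not the actual build tool; the function names and the stamp-file convention are assumptions made up for illustration. A test program is re-run only when it is newer than its stamp file, and the stamp is rewritten only after a passing run.

```d
import std.datetime : SysTime;
import std.file : timeLastModified, write;
import std.process : execute;

/// True when the test program is newer than its stamp file,
/// or when the stamp file does not exist yet.
bool testIsOutOfDate(string testExe, string stampFile)
{
    // SysTime.min is returned for a missing stamp file, so a test
    // that has never passed always counts as out of date.
    return timeLastModified(testExe) > timeLastModified(stampFile, SysTime.min);
}

/// Runs the test only when needed. Returns false if it ran and failed,
/// in which case the build should stop.
bool runIfOutOfDate(string testExe, string stampFile)
{
    if (!testIsOutOfDate(testExe, stampFile))
        return true;                  // previous pass is still valid

    // testExe should be a path to the built test program.
    auto result = execute([testExe]);
    if (result.status != 0)
        return false;                 // test failed: no stamp is written

    write(stampFile, "passed\n");     // record the pass for later builds
    return true;
}
```

A build script would call `runIfOutOfDate` once per test program after linking it, and fail the build on the first `false`.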

 * Since they only execute during the build, and even then only when 
 out of date, they can afford to be more complete tests (ie use plenty 
 of cpu time)

 IMO unit tests should not be run along with the full application.  I'd 
 suggest a simple unit test blank main function.  I think even dmd (or 
 rdmd?) will do this for you.

 There is no requirement to also run your application when running unit 
 tests.

That is my point exactly. I don't run tests as part of the application - 
the tests are separate utilities intended to be run automatically by the 
build tool. They can also be run manually to assist in debugging when 
something goes wrong.

 * If the code builds, you know all the unit tests pass. No need for a 
 special unittest build and manual running of assorted programs to see 
 if the tests pass.

 This all stems from your assumption that you have to run unittests 
 along with your main application.

 When I use D unit tests, my command line is:

 <command to build library/app> unittests

 e.g.

 make unittests

 No special build situations are required.  You can put this into your 
 normal build script if you wish (i.e. build 2 targets, one unit-tested 
 version and one release version).

 i.e.:

 all: unittests app

 * No need for special builds with -unittest turned on.

 Instead you need a special build of other external files?  I don't see 
 any advantage here -- on one hand, you are building special extra 
 files, on the other hand you are building the same files you normally 
 build (which you should already have a list of) with the -unittest 
 flag.  I actually find the latter simpler.

 -Steve

The difference in approach is basically this:

With unittest, tests and production code are in the same files, and are 
either built together and run together (too slow); or built separately 
and run separately (optional testing).

With my approach, tests and production code are in different files, 
built at the same time and run separately. The build system also 
automatically runs them if their results-file is out of date (mandatory 
testing).


Both approaches are good in that unit testing happens, which is very 
important. What I like about my approach is that the tests get run 
automatically when needed, so regressions are discovered immediately (if 
the tests are good enough). I guess you could describe the difference as 
automatic incremental testing versus manually-initiated batch testing.


-- 
Graham St Jack

Mar 24 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 24 Mar 2011 20:38:30 -0400, Graham St Jack  
<Graham.StJack internode.on.net> wrote:

 On 25/03/11 06:09, Steven Schveighoffer wrote:
 On Thu, 24 Mar 2011 00:17:03 -0400, Graham St Jack  
 <Graham.StJack internode.on.net> wrote:

 Regarding unit tests - I have never been a fan of putting unit test  
 code into the modules being tested because:
 * Doing so introduces stacks of unnecessary imports, and bloats the  
 module.

 As Jonathan says, version(unittest) works.  No need to bloat  
 unnecessarily.

 Agreed. However, all the circularity problems pop up when you compile  
 with -unittest.

This might be true in some cases, yes.  It depends on how much a unit test  
needs to import.

 * Executing the unittests happens during execution rather than during  
 the build.

 Compile-time code execution is not a good idea for unit tests.  It is  
 always more secure and accurate to execute tests in the environment of  
 the application, not the compiler.

 I didn't say during compilation - the build tool I use executes the test  
 programs automatically.

Your build tool can compile and execute unit tests automatically.

 Besides, this is an implementation detail.  It is easily mitigated.   
 For example, phobos' unit tests can be run simply by doing:

 make -f posix.mak unittest

 and it builds + runs all unit tests.  This can be viewed as part of the  
 "Build process".

 The problem I have with this is that executing the tests requires a  
 "special" build and run which is optional. It is the optional part that  
 is the key problem. In my last workplace, I set up a big test suite that  
 was optional, and by the time we got around to running it, so many tests  
 were broken that it was way too difficult to maintain. In my current  
 workplace, the tests are executed as part of the build process, so you  
 discover regressions ASAP.

It is as optional as it is to build external programs.  It all depends on  
how you set up your build script.

phobos could be set up to build and run unit tests when you type make, but  
it isn't because most people don't need to unit test released code, they  
just want to build it.

 The whole point of unittests is that if they are not easy to do and  
 conveniently located, people won't do them.  You may have a really good  
 system and good coding practices that allows you to implement tests the  
 way you do.  But I typically will forget to update tests when I'm  
 updating code.  It's much simpler if I can just add a new line right  
 where I'm fixing the code.

 In practice I find that unit tests are often big and complex, and they  
 deserve to be separate programs in their own right. The main exception  
 to this is low-level libraries (like phobos?).

It depends on the code you are testing.  Unit testing isn't for every  
situation.  For example, if you are testing that a client on one system  
can properly communicate with a server on another, it makes no sense to  
run that as a unit test.

Unit tests are for testing units -- small chunks of a program.  The point  
of unit tests is:

a) you are testing a small piece of a large program, so you can cover that  
small piece more thoroughly.
b) it's much easier to design tests for a small API than it is to design a  
test for a large one.  This is not to say that the test will be small, but  
it will be more straightforward to write.
c) if you test that all the small components of a system work the way they  
are designed, then the entire system should be less likely to fail.

This does not mean that a test for a function or class cannot be complex.

I can give you an example.  It takes little thinking and effort to test a  
math function like sin.  You provide your inputs, and test the outputs.   
It's a simple test.  When was the last time you worried that sin wasn't  
implemented correctly?  If you have a function that uses sin quite a bit,  
you are focused on testing the function, not sin, because you know sin  
works.  So the test of the function that uses sin gets simpler also.
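
As a rough sketch of that idea, assuming a made-up function `waveHeight` that is built on `sin`: the test checks only what `waveHeight` adds (the scaling), and takes `sin` itself on trust.

```d
import std.math : sin, PI, isClose;

/// Hypothetical function under test: scales sin(phase) by an amplitude.
double waveHeight(double amplitude, double phase)
{
    return amplitude * sin(phase);
}

unittest
{
    // sin is not re-tested here; only the behaviour waveHeight adds is checked.
    assert(isClose(waveHeight(2.0, PI / 2), 2.0));  // peak scales with amplitude
    assert(isClose(waveHeight(3.0, PI / 2), 3.0));
    assert(waveHeight(0.0, 1.0) == 0.0);            // zero amplitude gives zero
}
```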

 * Much easier to manage inter-module dependencies.

 Not sure what you mean here.

 I mean that the tests typically have to import way more modules than the  
 code under test, and separating them is a key step in eliminating  
 circular imports.

This can be true, but it also may be an indication that your unit test is  
over-testing.  You should be focused on testing the code in the module,  
not importing other modules.

 As for the time tests take, an important advantage of my approach is  
 that the test programs only execute if their test-passed file is out of  
 date. This means that in a typical build, very few (often 0 or 1) tests  
 have to be run, and doing so usually adds way less than a second to the  
 build time. After every single build (even in release mode), you know  
 for sure that all the tests pass, and it doesn't cost you any time or  
 effort.

This can be an advantage time-wise.  It depends on the situation.   
dcollections builds in less than a second, but the unit tests build takes  
about 20 seconds (due to a compiler design issue).  However, running unit  
tests is quite fast.

Note that phobos unit tests are built separately (there is not one giant  
unit test build, each file is unit tested separately), so it is still  
possible to do this with unit tests.

 The difference in approach is basically this:

 With unittest, tests and production code are in the same files, and are  
 either built together and run together (too slow); or built separately  
 and run separately (optional testing).

Or built side-by-side and unit tests are run automatically by the build  
tool.

 With my approach, tests and production code are in different files,  
 built at the same time and run separately. The build system also  
 automatically runs them if their results-file is out of date (mandatory  
 testing).

Unit tests can be built at the same time as building your production code,  
and run by the build tool.  You have obviously spent a lot of time  
creating a system where your tests only build when necessary.  I believe  
unit tests could also build this way if you spent the time to get it  
working.
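
One way such an "only run when out of date" step might look, as a sketch only. The names `needsRun` and `runIfStale` and the stamp-file convention are assumptions made for illustration, not the actual build tool described above:

```d
import std.datetime : SysTime;
import std.file : timeLastModified, write;
import std.process : execute;

/// True if the test executable is newer than its "passed" stamp file,
/// or if the stamp file does not exist yet.
bool needsRun(string testExe, string stampFile)
{
    // A missing stamp is treated as infinitely old.
    return timeLastModified(testExe) > timeLastModified(stampFile, SysTime.min);
}

/// Runs the test executable only when it is stale. Returns true if the
/// tests are known to pass, either from this run or from an up-to-date stamp.
bool runIfStale(string testExe, string stampFile)
{
    if (!needsRun(testExe, stampFile))
        return true;                    // stamp is current, nothing to do

    auto result = execute([testExe]);   // unittest-enabled binary runs its tests
    if (result.status != 0)
        return false;                   // leave the stamp stale so the build fails

    write(stampFile, "passed\n");       // record success for later builds
    return true;
}
```

The same check works whether `testExe` is a separate test program or a module compiled with -unittest; the build script just has to call it once per test binary.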

 Both approaches are good in that unit testing happens, which is very  
 important. What I like about my approach is that the tests get run  
 automatically when needed, so regressions are discovered immediately (if  
 the tests are good enough). I guess you could describe the difference as  
 automatic incremental testing versus manually-initiated batch testing.

Again, the manual part can be scripted, as can any manual running of a  
program.

What I would say is a major difference is that using unittest is prone to  
running all the unit tests for your application at once (which I would  
actually recommend), whereas your method only tests things you have deemed  
need testing.  I think unittests can be done that way too, but it takes  
effort to work out the dependencies.

I would point out that using the "separate test programs" takes a lot of  
planning and design to get it to work the way you want it to work.  As you  
pointed out from previous experience, it's very easy to *not* set it up to  
run automatically.

With D unit tests, I think the setup to do full unit tests is rather  
simple, which is a bonus.  But it doesn't mean it's for everyone's taste  
or for every test.

-Steve

Mar 25 2011

Graham St Jack <Graham.StJack internode.on.net> writes:

It sounds like we actually agree with each other on all the important 
points - it's just that the different starting positions made our 
near-identical ideas about testing look different.

Thanks for the discussion.

-- 
Graham St Jack

Mar 27 2011
