D - Implicit Asserts as Optimization/Compile Warnings/Debugging

Russ Lewis (113/113) Nov 30 2003 What if an assert in one place in the program was translated to many

Walter (5/6) Nov 30 2003 I think you're right. There is a lot of potential for an advanced compil...

Russ Lewis (23/32) Nov 30 2003 We could include this into the language standard now, in a way that

Walter (6/38) Dec 01 2003 compiler

Russ Lewis (35/37) Dec 01 2003 Sure, it can, but I think that that has problems. The programmer who

Berin Loritsch (16/23) Dec 10 2003 I dunno. Java has runtime assertion checking, and the thought process t...

Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:

What if an assert in one place in the program was translated to many 
different places in the program, reflecting the explicit and implicit 
assumptions of the programmer?  What if this system was used to allow 
for better optimization and compile-time error detection?

First, as I enter this, let me say that I am aware of the Halting 
Problem.  Most likely, any such optimization system would have to have a 
cap...it would stop after searching x deep (and, probably, issue a 
warning saying that it had to stop).  But let's explore the potential it 
has...

Second, I am aware that the mechanism I describe here only works if you 
compile all of the .d files together at once...if any function 
implementations are hidden, it becomes much more difficult (or 
impossible) to perform these optimizations.

My theory is that at any given line of code there are any number of 
assert()s which are true.  Some are explicit (stated in an earlier line, 
where variable values have not been altered) and some are implicit 
(stated later in the program, stated in calling functions, in called 
functions, etc.).  I'm thinking that the compiler should propogate 
assert()s as much as possible at compile time in an attempt to find dead 
code, bugs, and optimization opportunities.

The interesting thing about this is that it allows, in a sense, to have 
the program to describe its own compile-time error.  Basically, you make 
it a compile-time error (or warning?) if the compiler can detect any 
path where an assert() is hit.  First, the trivial case.  This function 
generates a compile-time error:

void foo()
{
    int a = 1;
    assert(a == 0);
}

However, this section of code also causes an error:

void foo(int arg)
body { assert(arg > 0); ... }

void bar()
{
    foo(0);
}

This is pretty trivial.  But things get more interesting if the compiler 
communicates implicit asserts around the program.  Take this example:

void foo(int arg)
body { assert(arg > 0); ... }

void bar(int arg)
{
    foo(arg);
    if(arg == 0)
        {    ...    }
}

Now, the compiler knows from the assert() in foo() that arg must be >0. 
  So it adds an implicit assert at compile time to bar(), which calls foo():

void bar(int arg)
{
    /* implicit */ assert(arg > 0);
    foo(arg)
    if(arg == 0)
        {    /* dead code...can't be reached */    }
}

(Of course, this implicit assert, once added to bar(), gets propogated 
up to all functions which call bar(), and so on.)

So far, this isn't very advanced.  But let's push things a little 
further, using the example where the implicit default case (of a switch) 
throws an exception (something people are debating a lot in a previous 
thread). First, let's make this assumption: any exception that gets all 
the way out past main() to the loader function is a program bug.  It 
means we didn't handle something properly.  So, we place an assertion in 
the loader function

int loader_function_foo(...)
{
    try {
        ...call main() here, with the right args...
    }
    catch    /* catch anything & everything */
    {
        assert(0);
    }
}

So, this means that if the compiler can detect any code path which 
results in an exception that is not caught, it will treat it as a 
compile-time error.  So, look at how the compiler can transform a switch 
statement that lacks a default case:

switch(var)
{
case 1:    ...    break;
case 2:    ...    break;
case 3:    ...    break;
/* this line implicitly added by the compiler */ default: throw 
NoMatchingCaseException;
}

Now, if the compiler looks up the call stack (as much as it is able) and 
finds no catch() for that exception, it further transforms the switch:

switch(var)
{
case 1:    ...    break;
case 2:    ...    break;
case 3:    ...    break;
/* this line implicitly added by the compiler */ default: assert(0);
}

Finally, this assertion is moved up in the function:

/* this line implicitly added by the compiler */ assert(var == 1 || var 
== 2 || var == 3);
switch(var)
{
case 1:    ...    break;
case 2:    ...    break;
case 3:    ...    break;
}

This assertion can continue to be carried up to calling functions.  The 
compiler will (in most cases, giving our respects to the Halting 
Problem) be able to determine that either the default case will never be 
hit, or that the default case can be hit.  In the latter situation, it 
fires off a compile-time error.  In the former case, the code is more 
optimized because the function does not have to have an added default: 
statement.

Thoughts?

Nov 30 2003

"Walter" <walter digitalmars.com> writes:

"Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
news:bqd07o$che$1 digitaldaemon.com...
 Thoughts?

I think you're right. There is a lot of potential for an advanced compiler
to make use of asserts. I also agree that if the compiler can prove that an
assert will be tripped, that should be a compile time error.

Nov 30 2003

Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:

Walter wrote:
 "Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
 news:bqd07o$che$1 digitaldaemon.com...
 
Thoughts?

 
 
 I think you're right. There is a lot of potential for an advanced compiler
 to make use of asserts. I also agree that if the compiler can prove that an
 assert will be tripped, that should be a compile time error.

We could include this into the language standard now, in a way that 
allows space both for compilers that support the feature and those that 
do not.  I would suggest a syntax where we overload assert() (at least) 
2 ways:

A: assert(expr);
B: assert(expr,uint);

A:
Existing assert().  Non-searching compilers (like current dmd) just 
compile in a test, with a throw if it fails, for debug builds. 
Searching compilers can copy this assert as many times as they deem 
appropriate.

B:
Programmer requests that the compiler search this assert() at least this 
many plys deep.  (Compiler may search even further if so desired.) 
Non-searching compilers should issue a WARNING stating that they are 
unable to search.  Searching compilers should offer compile-time WARNING 
if the search cannot conclusively determine that the assert() is safe 
before running out of time.

In all cases, searching compilers would issue a compile time ERROR if 
they detected any path where the assert might fail.

Thoughts?  Is this a good syntax?  If so, can we add it to the official 
syntax now, even if dmd doesn't support searching yet?

Nov 30 2003

"Walter" <walter digitalmars.com> writes:

"Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
news:bqemvh$2ohu$1 digitaldaemon.com...
 Walter wrote:
 "Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
 news:bqd07o$che$1 digitaldaemon.com...

Thoughts?


 I think you're right. There is a lot of potential for an advanced


compiler
 to make use of asserts. I also agree that if the compiler can prove that


an
 assert will be tripped, that should be a compile time error.

 We could include this into the language standard now, in a way that
 allows space both for compilers that support the feature and those that
 do not.  I would suggest a syntax where we overload assert() (at least)
 2 ways:

 A: assert(expr);
 B: assert(expr,uint);

 A:
 Existing assert().  Non-searching compilers (like current dmd) just
 compile in a test, with a throw if it fails, for debug builds.
 Searching compilers can copy this assert as many times as they deem
 appropriate.

 B:
 Programmer requests that the compiler search this assert() at least this
 many plys deep.  (Compiler may search even further if so desired.)
 Non-searching compilers should issue a WARNING stating that they are
 unable to search.  Searching compilers should offer compile-time WARNING
 if the search cannot conclusively determine that the assert() is safe
 before running out of time.

 In all cases, searching compilers would issue a compile time ERROR if
 they detected any path where the assert might fail.

 Thoughts?  Is this a good syntax?  If so, can we add it to the official
 syntax now, even if dmd doesn't support searching yet?

I think the compiler can potentially handle this automatically without any
additional syntax.

Dec 01 2003

Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:

Walter wrote:
 I think the compiler can potentially handle this automatically without any
 additional syntax.

Sure, it can, but I think that that has problems.  The programmer who 
uses a searching compiler will assume, when he writes an assert, that 
one of two things are going to happen.  Either
a) The compiler searches the program and determines that my assert is 
for-sure safe; it thus removes it from the program altogether, or
b) The compiler searches the program and finds and error; it tells me so 
with a compile-time error.

Thus, asserts become not just a sanity checking tool, but a very 
powerful unittesting tool.  However, what happens when there is a third 
case:
c) The compiler tries to search the assert() but gives up after a little 
while.  It doesn't tell me, though, so I have potential of runtime 
errors at any arbitrary time in the future.

If (c) is possible, then I think that programmers won't use assert()s to 
their full potential, nor will they trust them.  So, I would argue, if 
the programmer is expecting the compiler to search a given assert(), but 
the compiler gives up because of Halting Problem troubles, then it 
really ought to report this to the user with a warning.

However, you have repeatedly stated (and I agree) that a multiplicity of 
warnings just causes troubles.  Too many warnings is just as bad as none 
at all.  So, I proposed two syntaxes.

One syntax (without the uint arg) simply makes an assert as we currently 
have them.  Searching compilers may use them, but need not report 
warnings if the search runs out of time.

The other syntax (with the uint arg) is an explicit statement from the 
programmer that he wants this assert() to be searched.  Thus, compilers 
are required to do their best to search, and if they cannot acheive it, 
then they are required to post a warning.

IMHO, this gives the best of both worlds - it gives a "clean" output for 
the simply written program, and it gives a "detailed" output for 
programs which require it.  Moreover, it gives a clear, language-level 
division between which assert()s MUST be searched, and which can be 
searched at the convenience of the compiler.

Russ

Dec 01 2003

Berin Loritsch <bloritsch d-haven.org> writes:

Russ Lewis wrote:

 
 If (c) is possible, then I think that programmers won't use assert()s to 
 their full potential, nor will they trust them.  So, I would argue, if 
 the programmer is expecting the compiler to search a given assert(), but 
 the compiler gives up because of Halting Problem troubles, then it 
 really ought to report this to the user with a warning.
 

I dunno.  Java has runtime assertion checking, and the thought process there
is pretty simple:

Assertions are debug tools.  When assertions are used, they should never have
side effects (i.e. we are only checking state, not making changes to state).
Therefore, I can use assertions to do complex validations that are expensive
to do.  At runtime if the assertions are enabled (usually during development)
then we perform all those checks.  However, assertions can also be disabled
which means those expensive checks are no longer performed.  The thought here
is that any reasons for those assertions have already been shaken out.

I like using assertions as debug tools.  If assertions can never be disabled
at runtime, then there are certain types of checks that I won't do because it
would be too computationally expensive.

If the language performs runtime checking that is fine--it is a different tool
than an assertion.  Bounds checking is critical with arrays--I would never want
that turned off.

Dec 10 2003

D Programming

C/C++ Programming

Other

D - Implicit Asserts as Optimization/Compile Warnings/Debugging