digitalmars.D - Pre-conditions at compile-time with static arguments

reply bearophile <bearophileHUGS lycos.com> writes:
This is a possible enhancement request for DMD. I am not sure about its
consequences, so it is not in Bugzilla yet; I am looking for opinions.

DMD currently runs functions at compile time only if their call is in a context
where a compile-time value is expected, to avoid troubles. C++0X uses a keyword
to avoid those problems.

[-----------------

Little box:

Some time ago I heard that "enum" is useless and const/immutable suffice. So if
"enum" gets removed from D how do you tell apart the desire to run a function
(not a global call) at compile time from the desire to run it at run time and
assign its result to a run-time constant value?

int bar(int x) {
    return x;
}
void main() {
    enum b1 = bar(10); // runs at compile-time
    const b2 = bar(20); // doesn't run at compile-time    
}

-----------------]
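
A minimal sketch of how the difference in the box can be observed, assuming a current D compiler. The variant of bar below is an illustrative twist, not the original function: it uses the built-in __ctfe variable to return a different value depending on whether it is being interpreted at compile time.

```d
// bar returns x under CTFE and -x at run time, purely to make the
// evaluation mode visible (an illustrative twist, not the original bar).
int bar(int x) {
    // __ctfe is true only while the compiler is interpreting the function
    return __ctfe ? x : -x;
}

void main() {
    enum b1 = bar(10);   // an enum initializer must be a constant, so CTFE runs
    const b2 = bar(20);  // a plain local initializer is an ordinary run-time call
    static assert(b1 == 10);
    assert(b2 == -20);
}
```

The declaration on the left-hand side decides the evaluation mode here: only a context that demands a constant triggers compile-time execution.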

DMD is able to run contracts too at compile-time, as you see in this program
(at the r1 enum):


import std.ctype: isdigit;
int foo(string text, int x)
in {
    assert(x >= 0 && x < text.length);
    foreach (c; text[0 .. x])
        assert(isdigit(c));
} body {
    return 0;
}
enum r1 = foo("123xxx", 4); // Error: assert(isdigit(cast(dchar)c)) failed
void main(string[] args) {
    auto r2 = foo(args[2], (cast(int)args.length) - 5);
    auto r3 = foo("123xxx", 4);
}


test.d(3): Error: assert(x > 0) failed
test.d(7): Error: cannot evaluate foo(-1) at compile time
test.d(7): Error: cannot evaluate foo(-1) at compile time
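
For readers on a current compiler, the experiment can be sketched as follows. This is an approximation under stated assumptions: std.ascii.isDigit stands in for std.ctype.isdigit, "do" replaces "body", contracts are enabled (no -release switch), and passesAtCompileTime is a hypothetical helper name that does not appear in the original post.

```d
import std.ascii : isDigit;

int foo(string text, int x)
in {
    assert(x >= 0 && x < text.length);
    foreach (c; text[0 .. x])
        assert(isDigit(c));
}
do {
    return 0;
}

// Hypothetical helper: true if foo(text, x) can be evaluated by CTFE
// without tripping an assert in the pre-condition.
enum bool passesAtCompileTime(string text, int x) =
    __traits(compiles, { enum r = foo(text, x); });

static assert( passesAtCompileTime!("123xxx", 3)); // "123" is all digits
static assert(!passesAtCompileTime!("123xxx", 4)); // 'x' fails isDigit

void main() {}
```

Wrapping the call in __traits(compiles, ...) lets the failing contract be observed without stopping the build.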


Given that pre-conditions are meant to run fast and to be (nearly) pure, this
is my idea: when you call a function with all arguments known at compile time
(as at the r3 variable) the compiler runs just the pre-condition of that
function at compile-time (and not its body and post-condition).

The advantage is some bugs are caught at compile-time, and the compiler is kept
simple (I'd like contracts to be verified statically in more situations, but
this requires a more complex compiler). The disadvantage is longer compilation
times, and troubles caused by pre-conditions that can't really run at
compile-time (like a precondition that uses a compiler intrinsic). A way to solve
this last problem is to not raise an error if a pre-condition can't run at
compile-time, and just ignore it, and let it later run normally at run-time.
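
The idea can be approximated in user code, as a rough sketch rather than the proposed compiler feature. It assumes the pre-condition is factored into a separate function; fooPre and fooChecked are hypothetical names introduced only for this illustration.

```d
import std.ascii : isDigit;

// Hypothetical helper: the pre-condition of foo as an ordinary bool function.
bool fooPre(string text, int x) {
    if (x < 0 || x >= text.length)
        return false;
    foreach (c; text[0 .. x])
        if (!isDigit(c))
            return false;
    return true;
}

int foo(string text, int x)
in {
    assert(fooPre(text, x));
}
do {
    return 0;
}

// Hypothetical wrapper for statically known arguments: only fooPre is
// interpreted at compile time; the body of foo still runs at run time.
int fooChecked(string text, int x)() {
    static assert(fooPre(text, x),
                  "foo: pre-condition fails for these constant arguments");
    return foo(text, x);
}

void main() {
    auto r3 = fooChecked!("123xxx", 3)();    // accepted
    // auto r4 = fooChecked!("123xxx", 4)(); // rejected at compile time
    assert(r3 == 0);
}
```

Unlike the proposal, this version is opt-in at each call site and fails the build if fooPre itself cannot be interpreted, instead of silently deferring the check to run time.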

The variable r2 is a situation where not all arguments of foo() are known at
compile-time, so here the foo pre-condition is not run.


(Note: the variable r2 is a situation where one argument of foo is known at
run-time, and in this case the pre-condition contains a part
(assert(x>=0&&x<text.length)) that's able to use this information to catch a
bug. A possible improvement of the idea is to perform this partial
test. This looks less easy to implement, so it's not important).

Bye,
bearophile
Apr 25 2011
next sibling parent reply so <so so.so> writes:
On Mon, 25 Apr 2011 22:44:07 +0300, bearophile <bearophileHUGS lycos.com>  
wrote:

 This is a possible enhancement request for DMD. I am not sure about the  
 consequences of this, so this is currently not in Bugzilla, I look for  
 opinions.

 DMD currently runs functions at compile time only if their call is in a  
 context where a compile-time value is expected, to avoid troubles. C++0X  
 uses a keyword to avoid those problems.
constexpr double pi = 3; // C++
enum double pi = 3; // D

constexpr double fun() { return 3; } // C++
double fun() { return 3; } // D

For variable declarations, it is exactly the same thing. For functions, I am not sure a keyword is needed, and I think the D way is simply beautiful and truly generic. One reason I can think of is library development: you sometimes want to be sure that a function "could be" executed at compile-time. In C++ there is no such thing that I am aware of; in D we have unit testing.
 [-----------------

 Little box:

 Time ago I have heard that "enum" is useless and const/immutable  
 suffice. So if "enum" gets removed from D how do you tell apart the  
 desire to run a function (not a global call) at compile time from the  
 desire to run it at run time and assign its result to a run-time  
 constant value?

 int bar(int x) {
     return x;
 }
 void main() {
     enum b1 = bar(10); // runs at compile-time
     const b2 = bar(20); // doesn't run at compile-time
 }
Another question is why the left-hand side of "=" should even matter. What is wrong with always executing the function(constant) pair at compile-time (if it can be executed, that is)? The only reason to use "enum, static, immutable" over "auto" should be "making sure" it runs at compile-time (again, if it can). We know the reasons for wanting compile-time execution; is there any reason at all to want run-time execution?
Apr 25 2011
next sibling parent so <so so.so> writes:
 Only reason using "enum, static, immutable" over "auto" should be  
 "making sure" it runs at compile-time (again, if it can).
Besides the obvious reason, making it immutable :)
Apr 25 2011
prev sibling parent so <so so.so> writes:
 Time ago I have heard that "enum" is useless and const/immutable  
 suffice.
AFAIK it wasn't clear. It was said that we adopted this usage of "enum" just because of some bugs in "const/immutable", but we also know that "enum" and "immutable" have differences, especially related to this issue. We can't replace "enum" with "immutable". To answer my own question from another reply: for arguably the same reasons, a function(constant) pair is not expected to run at compile-time unless it is labeled with "enum" or "static".
Apr 25 2011
prev sibling next sibling parent KennyTM~ <kennytm gmail.com> writes:
On Apr 26, 11 03:44, bearophile wrote:
 [-----------------

 Little box:

 Time ago I have heard that "enum" is useless and const/immutable suffice. So
if "enum" gets removed from D how do you tell apart the desire to run a
function (not a global call) at compile time from the desire to run it at run
time and assign its result to a run-time constant value?

 int bar(int x) {
      return x;
 }
 void main() {
      enum b1 = bar(10); // runs at compile-time
      const b2 = bar(20); // doesn't run at compile-time
 }

 -----------------]
static b1 = bar(20);
Apr 25 2011
prev sibling next sibling parent reply KennyTM~ <kennytm gmail.com> writes:
On Apr 26, 11 03:44, bearophile wrote:
 Given that pre-conditions are meant to run fast and to be (nearly) pure, this
is my idea: when you call a function with all arguments known at compile time
(as at the r3 variable) the compiler runs just the pre-condition of that
function at compile-time (and not its body and post-condition).

 The advantage is some bugs are caught at compile-time, and the compiler is
kept simple (I'd like contracts to be verified statically in more situations,
but this requires a more complex compiler). The disadvantage is longer
compilation times, and troubles caused by pre-conditions that can't really run
at compile-time (like a precondition that uses a compiler intrinsic). A way to
solve this last problem is to not raise an error if a pre-condition can't run
at compile-time, and just ignore it, and let it later run normally at run-time.

 The variable r2 is a situation where not all arguments of foo() are known at
compile-time, so here the foo pre-condition is not run.


 (Note: the variable r2 is a situation where one argument of foo is known at
run-time, and in this case the pre-condition contains a part
(assert(x>=0&&x<text.length)) that's able to use this information to catch a
bug. This is a possible improvement of the idea is to perform this partial
test. This looks less easy to implement, so it's not important).

 Bye,
 bearophile
I feel this isn't right. The input or the precondition is very often not compile-time evaluable, thus it just won't run. An example grabbed from my code nearby:

foreach (key; keys)          // <-- keys is usually not known at compile time
    assert(key !in _map);    // <-- 'in' is not CTFE-able (yet).

Also, if the precondition were complaining, and you change some bit of the code and find it can be compiled -- that won't mean anything, as it is likely the change just made the precondition non-CTFE-able. The benefit isn't worth the cost.

(And this is yet another kind of 0% false-positive-rate, non-zero false-negative-rate feature. I bet you'll get a "false sense of security" response. Personally I'd rather have the compiler catch 50% of compile-time detectable errors than leave 100% of them to run time.)

I also don't believe this still keeps the compiler simple, as now there are 2 kinds of CTFE error the compiler needs to distinguish: assert errors and non-assert errors. I don't know much about the compiler intrinsic, and actually the D front-end is already very complicated :), so this doesn't matter.
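
The failure mode described above can be sketched in a few lines, assuming a current D compiler. preA, preB and isCtfeable are hypothetical names introduced for the illustration; the call to the C function getenv merely stands in for any construct that CTFE cannot interpret.

```d
// A pre-condition that CTFE can interpret.
bool preA(int x) {
    return x >= 0;
}

// The same check after an edit that makes it non-CTFE-able:
// C functions have no source available to the interpreter.
bool preB(int x) {
    import core.stdc.stdlib : getenv;
    return getenv("PATH") !is null && x >= 0;
}

// Hypothetical helper: can pre(x) be evaluated at compile time at all?
enum bool isCtfeable(alias pre, int x) =
    __traits(compiles, { enum r = pre(x); });

static assert( isCtfeable!(preA, -1)); // evaluable (the result is false)
static assert(!isCtfeable!(preB, -1)); // not evaluable: the static check
                                       // would be skipped without notice
void main() {}
```

Under the proposal, a call with the constant argument -1 would be flagged when checked against preA but would compile quietly against preB, even though the bug is still there.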
Apr 25 2011
parent reply bearophile <bearophileHUGS lycos.com> writes:
KennyTM~:

 Also, if the precondition were complaining, and you change some bit of 
 the code and find it can be compiled -- that won't mean anything, as it 
 is likely the change just make the precondition non-CTFE-able.
You are right, this is bad. But the compile-time interpreter is getting able to run a larger percentage of D code, so that problem becomes smaller, and this idea doesn't give false positives.
 I also don't believe this still keeps the compiler simple,
powerful symbolic constraint solver inside the compiler :-) All the needed tools are already present. Bye, bearophile
Apr 25 2011
parent KennyTM~ <kennytm gmail.com> writes:
On Apr 26, 11 07:25, bearophile wrote:
 KennyTM~:

 Also, if the precondition were complaining, and you change some bit of
 the code and find it can be compiled -- that won't mean anything, as it
 is likely the change just make the precondition non-CTFE-able.
 You are right, this is bad. But the compile-time interpreter is getting able to run a larger percentage of D code, so that problem becomes smaller, and this idea doesn't give false positives.
 I also don't believe this still keeps the compiler simple,
 powerful symbolic constraint solver inside the compiler :-) All the needed tools are already present. Bye, bearophile
I see. I was comparing to the current situation where such feature does _not_ exist.
Apr 25 2011
prev sibling next sibling parent Walter Bright <newshound1 digitalmars.com> writes:
bearophile wrote:
 Given that pre-conditions are meant to run fast and to be (nearly)
 pure, this is my idea: when you call a function with all arguments
 known at compile time (as at the r3 variable) the compiler runs just
 the pre-condition of that function at compile-time (and not its body
 and post-condition).
 
 The advantage is some bugs are caught at compile-time, and the
 compiler is kept simple (I'd like contracts to be verified statically
 in more situations, but this requires a more complex compiler). The
 disadvantage is longer compilation times, and troubles caused by
 pre-conditions that can't really run at compile-time (like a
 precondition that uses a compiler intrinsic). A way to solve this last
 problem is to not raise an error if a pre-condition can't run at
 compile-time, and just ignore it, and let it later run normally at
 run-time.
We thought about this approach and deliberately did not do it. It is not predictable at compile time whether a function can be executed at compile time or not (the halting problem). Therefore, you'll wind up with cases that silently fail at compile time, and so are put off until runtime. The user cannot tell (without looking at assembly dumps) if it happened at compile time or not.

Instead, we opted for a design that either must run at compile time, or must run at run time. Not one that decides one way or the other in an arbitrary, silent, and ever-changing manner. The user must make an explicit choice if code is to be run at compile time or run time.

D has had this in operation for several years, and it's been a big success. I don't think we should tamper with the formula without very compelling evidence.
Apr 25 2011
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
I was distracted and had an unfocused mind, sorry for the late reply.

Walter:

 We thought about this approach and deliberately did not do it.
Thank you for your answer. Sometimes it doesn't harm to think again a bit about a topic.
 It is not predictable at compile time whether a function can be executed
 at compile time or not (the halting problem). Therefore, you'll wind up
 with cases that silently fail at compile time, and so are put off until
 runtime. The user cannot tell (without looking at assembly dumps) if it
 happened at compile time or not.
 
 Instead, we opted for a design that either must run at compile time, or
 must run at run time. Not one that decides one way or the other in an
 arbitrary, silent, and ever-changing manner.
In this discussion I was talking about running just the pre-conditions, not the whole function and not its post-condition. This is different because:
- Pre-conditions are usually small, or smaller than function bodies;
- Pre-conditions are usually meant to be fast (and not slower than the function body), so they are probably not too heavy to run. Pre-conditions, unlike debug{} code, are meant to run often;
- Pre-conditions are often pure (mine are).

If the "feature" I am talking about gets introduced, D programmers will be encouraged to write more pure pre-conditions (this probably explains why most other contract-based systems I've seen don't allow free code in contracts as D does, but use a specific expression language. This forces them to be pure and simpler, more fit for analysis!). Even if a function is not marked as "pure", what matters in this discussion is for its pre-condition to be CTFE-pure.

The problem you list is of course important for the normally run CT functions, and I agree with the decision. But it's much less important for my idea, because I've seen that finding bugs in code is essentially never a deterministic process, it's very probabilistic. People find only some bugs (even today people find new bugs in 10+ year old, very high quality C code used by everyone (Zlib)), lint tools (including the static analysis flag of Clang I've shown recently) find only some other bugs, and different lints find different bugs. One important factor for those tools is to reduce false positives as much as possible (even if this increases false negatives a little), and this idea of mine produces zero false alarms (if the pre-conditions are written correctly).

This feature is useful because when in your code you have struct literals like:

Foo f1 = Foo(10, -20);

the compiler is able at compile-time to tell you that line of code is wrong because -20 is not acceptable :-) This is useful in many situations.
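
A small sketch of the struct-literal case, assuming a current D compiler with contracts enabled. The post does not show the definition of Foo, so the fields and the constructor contract below are hypothetical.

```d
// Hypothetical definition: the original post does not show Foo.
struct Foo {
    int a, b;

    this(int a, int b)
    in {
        assert(b >= 0, "b must be non-negative");
    }
    do {
        this.a = a;
        this.b = b;
    }
}

// Forcing CTFE with enum runs the constructor contract at compile time.
enum Foo good = Foo(10, 20);

// The bad literal is rejected when its evaluation is forced at compile time.
static assert(!__traits(compiles, { enum Foo bad = Foo(10, -20); }));

void main() {
    // Today this line compiles and fails only at run time;
    // the proposal would flag it during compilation.
    // Foo f1 = Foo(10, -20);
}
```

The enum declarations show what the compiler can already do when asked explicitly; the proposal amounts to doing the same check unasked for any call whose arguments are all constants.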
This feature works only if the arguments are known at compile-time. This is a strong limitation, but I think it's better to have it anyway. Even if this feature sometimes gets disabled by turning a CTFE-pure pre-condition into not CTFE-pure code, as you say, the programmer doesn't need to care a lot about this: even if the change means a bug in the code is no longer caught, other bugs too are not found by the compiler. All static analysis tools do the same: they sometimes find a bug, but if you change the code in a way they don't fully understand, they don't find the bug any more. This is why I think your argument doesn't hold.

The feature I have proposed is not a language feature, it's a compiler feature (the only change is in user code, which is encouraged to use CTFE-pure pre-conditions). This means that even if DMD doesn't want this idea, future D compilers will be free to adopt it. And from the direction of the tide (as Clang adds better and better analysis for C and C++ code) I think this will be seen as a cheap but useful compiler feature to add (a "low-hanging fruit", because the only compiler change I see as needed is to not stop the compilation if a pre-condition of a function with statically known arguments turns out to be not runnable).

Bye,
bearophile
Apr 26 2011