www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Const correctness revisited (proposal)

reply Oliver Dathe <o.dathe gmx.de> writes:
Hello D folks,

I'd like to introduce some suggestions for D2:
   1.) Enforceable immutability of function calls regarding parameters
   2.) Parameter related tail constness
   3.) Enforceable tail constness on calls

In thread [1] the problem was stated, what a clean D2 version of a 
function may look like, that returns a mutable slice from an input 
string which (the input string) may not be mutable by the function 
itself. There were no really clean solutions avoiding casts and 
templates. Therefore I'd like to suggest to think about how to force 
constness/immutability of functions and calls regarding parameters.

Proposals:

1.) Enforceable immutability of function calls regarding parameters

I'd like to suggest that, at the time we call a function, we may be able 
to force some overlay constness regarding the parameters. e.g.

   char[] foo(char[] source, char[] pattern) {} // example of Steven [1]
   char[] x = "some input";
   char[] y = foo(const x, "somepattern");      // (1) note the const

(1) The const means we want foo() to not mutate this parameter (with the 
respective transitivity to underlying calls which take x as parameter). 
If there is no version of foo() that could meet this requirement (at 
least implicitly), it is an error. Note that the typeof(source) remains 
plain char[] here.

Ok, usually if the creator knew he does not want to mutate a parameter 
he would declared it as const. But, there may be several reasons not to 
do so: a.) D1 b.) if he wants to return a mutable char[] slice of source 
he has to cast away the const which is not desirable in general.

Having parameter source be of type char[] gives us the opportunity to 
return a mutable slice without casting away constness. Calling foo(const 
x, ...); gives us the opportinity to ensure that x is not mutable by 
foo() while already having typeof(source)==char[].

You may also want to take some void d1function(char[] x); and call it 
e.g. with some const(char)[] argument, which may then internally be 
translated to casting away the constness but forcing immutability of x 
in d1function().

Now let us extend this to a more general scheme:


2.) Parameter related tail const correctness

If we think of how to bring this kind of overlay constness to the 
function declaration, we may come up with adding a tail constness 
regarding parameters:

   char[] bar(char[] a const, int b) {...} // (2)

(2) bar() is forced to be compilable if parameter a virtually was const 
char[] but the actual typeof(a) remains char[]. If bar() tries to mutate 
parameter a, that would be an error.

This also implies that if bar() calls foo(x,...) the call internally 
would force this overlay constness in foo() regarding x (transitive), so 
the call silently becomes foo(const x,...); behind the scenes.

Advantages of parameter related tail constness:
a.) It is some sort of injected, transparent and transient overlay 
constness only regarding the functions scope.
b.) Usual advantages with const correctness like detecting accidental 
mutation but you may already stick with the basic types.
c.) As expressive to the reader as if the parameter type was constified 
  but it just stays basic.
d.) You may stick with the easygoing D1 typing style, but you get the 
opportunity to let a 'silent const guard' work for you behind the scenes.
e.) It just naturally extends the tail constness concept from 'this' to 
parameters.
f.) May encourage SafeD [3][4]
g.) Compiler/Optimizer may already deduce this attribute?
h.) No different bytecode version

Recall Stevens problem [1] is solved with either

   char[] foo(char[] input const; char[] pattern){...}// at declaration
   // or
   char[] mutableslice = foo(const x; "mypattern");   // at function call
   // or both


3.) Enforceable tail constness on calls

Try to force the call to be immutable regarding their respective object. 
The call must either be explicitly tail const [2] or implicitly by not 
mutating the object.

   z = obj.foo(x,y) const;     // (3)
   z = obj.foo(x,y) invariant; // (4)
   func("input") invariant;    // (5)

(3) Forces foo() to be explicitly or implicitly tail const. If that is 
not possible it is an error.
(4) and (5) world-immutability. See [2]


[1] 
http://www.digitalmars.com/d/archives/digitalmars/D/const_debacle_68011.html
[2] http://s3.amazonaws.com/dconf2007/WalterAndrei.pdf p.29
[3] 
http://www.digitalmars.com/d/archives/digitalmars/D/Memory_Safety_68031.html
[4] http://www.digitalmars.com/d/2.0/safed.html

-Oliver
Mar 23 2008
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Oliver Dathe wrote:
 Hello D folks,
 
 I'd like to introduce some suggestions for D2:
   1.) Enforceable immutability of function calls regarding parameters
   2.) Parameter related tail constness
   3.) Enforceable tail constness on calls
 
 In thread [1] the problem was stated, what a clean D2 version of a 
 function may look like, that returns a mutable slice from an input 
 string which (the input string) may not be mutable by the function 
 itself. There were no really clean solutions avoiding casts and 
 templates.

The reason there is no clean way to do this is because it is a fundamentally unsound operation to do it. Doing so violates the const-ness contract of the function. Please see my posts in the "const debacle" thread. In other words, T[] f(const(T)[] t) { return t; } is wrong, wrong, wrong, and must not compile.
Mar 24 2008
next sibling parent reply Oliver Dathe <o.dathe gmx.de> writes:
Walter Bright wrote:
 The reason there is no clean way to do this is because it is a 
 fundamentally unsound operation to do it. Doing so violates the 
 const-ness contract of the function. Please see my posts in the "const 
 debacle" thread. In other words,
 
    T[] f(const(T)[] t)
    {
     return t;
    }
 
 is wrong, wrong, wrong, and must not compile.

I completely agree on that. That's right what pushed me to think about tail constness regarding parameters. You could have the cont-ness contract without touching the type. T[] f(T[] t const) { return t; } It is a transient const-ness not a reflexive. It does not creep into types or out of the function. It is transitive in the way that it has to creep down subsequent calls within f() to functions that take t as paramter. That's all pretty sound isn't it? It as well avoids /polution/ of types at the same time.
Mar 24 2008
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Oliver Dathe wrote:
 That's right what pushed me to think about tail constness regarding 
 parameters. You could have the cont-ness contract without touching the 
 type.
 
   T[] f(T[] t const) {
     return t;
   }
 
 It is a transient const-ness not a reflexive. It does not creep into 
 types or out of the function. It is transitive in the way that it has to 
 creep down subsequent calls within f() to functions that take t as 
 paramter. That's all pretty sound isn't it? It as well avoids /polution/ 
 of types at the same time.

That would be of very limited utility because I could not pass a const(T)[] array to the function. It's weird to have a function that doesn't modify its parameters where one could not pass const data to.
Mar 24 2008
parent reply Oliver Dathe <o.dathe gmx.de> writes:
Walter Bright wrote:
 That would be of very limited utility because I could not pass a 
 const(T)[] array to the function. It's weird to have a function that 
 doesn't modify its parameters where one could not pass const data to.

Ok I'll try something... Definition: If f has a 'tail const paramater' t of type T, then f must be compilable if it also is compilable for typeof(t)=const(T) on data operations. Otherwise it is an error. T[] f(T[] t const) { T sum; foreach (i, ti; t) { sum+=ti; } // ok auto h = t.toHash(); // ok, invariant data operation static assert (is(typeof(t)==T[])); // ok, no data operation // t[2] = ...; // error // t.length = 17; // error return t; } ... char[] a, y; const(char)[] b; const(char[]) c; ... // all following three calls will succeed // note: for all three returntype==typeof(y)==char[] holds y=f(a); y=f(b); y=f(c); You can see: If f is compilable, then the call to f succeeds if T_param_at_call is implicitly convertible to const(T_param_at_func). So a call to a function char[] foo(char[] x const) {...} with a const(char)[] parameter also succeeds. Would this also imply the call is sound? Btw: How or where is sound in this context defined? In general: That we define types with const properties where it makes sense is really fine, no doubt! However, the problem is if we want to pass an immutable view of usual mutable data to a function, the real "solution" is apparently not to just use some const(char)[] as input type because a.) the Ts behind input usually are not really const but we just want an immutable view on them. By using const(char)[] b.) we are forced to use an evil const-away cast when we want to mutate the result afterwards. To some degree that is contradictory and counterproductive (cast-away-const==cast-away-sound right?). I simply aim to be able to prove: STATE before = STATE(x); // x + anything reachable through x y=foo(x); static assert (STATE(x)==before); // must be provable at compile time This currently can be ensured by x beeing const. That is the way D2 currently forces. But this constness must not necessarily be a natural attribute of the data! In most scenarios we may simply want to ensure foo() is not mutating x without artificially constifying the data. I think that is the heart of the ever upcoming 'const debacle'. Also imagine: You have a lot of D1 code with 'in T[]' parameters but you'd like to use them with a lot of const(T)[] variables where it makes sense. Note: This is the same scenario as you mentioned in your post before. Currently that will fail for the same reason. Now, let us try to let calls to D1 functions succeed that do not explicitly declare const parameters or tail const parameters. In order use them with our const(T) input variable we could either a.) call foo(const input) as proposed in my original post (enforce immutability of the function regarding the parameter) or b.) since input is const anyways, parameter tail constancy may automatically be forced. If we do so, we would also have a huge gain of D1 code that is usable in a D2 const styled world! Not at least the optimizer will appreciate this but also having tail const parameters in declarations will be of valueble information to the user/reader of the code/docs. Another Question: The transitive behaviour of const will also hold for the planned D tail const methods? This would have great impact on a D2 version of the following dirty C++ code: #include<cassert> class A { char* p; public: A() : p(new char[50]) {} ~A() { delete[] p; } // within tail const methods typeof(this) is (A const *) // but typeof(this->ptr) remains char* in C++, (nontransitive) char* getp() const { return p; } void mutate() const { p[5]=42; } }; int main() { A a; char* ptr = a.getp(); ptr[5] = 17; a.mutate(); assert(ptr[5]==42); return 0; } So in D2 within tail const methods typeof(this.p)==const(char*) and typeof(this.p[0])==const(char) holds? Did I get this right? That'd be consistent with the rest and pretty cool I think.
Mar 24 2008
parent Oliver Dathe <o.dathe gmx.de> writes:
Oliver Dathe wrote:

 
   T[] f(T[] t const) {
     T sum;
     foreach (i, ti; t) { sum+=ti; }     // ok
     auto h = t.toHash();                // ok, invariant data operation
     static assert (is(typeof(t)==T[])); // ok, no data operation
     // t[2] = ...;                      // error
     // t.length = 17;                   // error
     return t;
   }
   ...
   char[]        a, y;
   const(char)[] b;
   const(char[]) c;
   ...
   // all following three calls will succeed
   // note: for all three returntype==typeof(y)==char[] holds
   y=f(a);
   y=f(b);
   y=f(c);

Ok, the last two calls to f were evil, because you get a mutable slice of immutable data and therefore must not be compilable, as Walter pointed out before! But you could use: T f(T)(T t const) { ... } or: T f(return T t const) { ... } // See Walter/Andrei DConf07 p.38/39 Then you could use it as intended: char[] a; const(char)[] b; const(char[]) c; ... auto ya = f(a); static assert (is(typeof(ya)==char[])); ya[17]='5'; ... auto yb = f(b); static assert (is(typeof(yb)==const(char)[])); yb.length = 42; ... auto yc = f(c); static assert (is(typeof(yc)==const(char[]));
Mar 25 2008
prev sibling parent Oliver Dathe <o.dathe gmx.de> writes:
Janice Caron wrote:
 Just to be pedantic, the word is "constancy", not "const-ness". :-)

Walter started with "const-ness", i just added another typo :) For the rest see my reply to Walter. Regarding the slices: In the DConf slides, Walter and Andrei propose a superior way of distinguishing between 'window' slices T[] and expandable 'genuine' arrays T[new]. See page 26.
Mar 24 2008
prev sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 24/03/2008, Oliver Dathe <o.dathe gmx.de> wrote:
  That's right what pushed me to think about tail constness regarding
  parameters. You could have the cont-ness contract without touching the type.

Just to be pedantic, the word is "constancy", not "const-ness". :-)
    T[] f(T[] t const) {
      return t;
    }

What if I wanted to distinguish between const(T)[] and const(T[])? What if I wanted the type of t to be "const(T[])[]"? Declaring "T[][] const" doesn't show where the brackets are supposed to go. Const-at-the-end doesn't make sufficient distinction. Furthermore, as I'm sure Walter will point out if he replies to this before I do, "transient constancy" will break const correctness, and so /must not be allowed/, whatever the syntax. Not that I want this, but if you really wanted to allow slices to be returned, a better way would be something like sliceof(t) f(const(T)[] t) { return t; } This would not introduce any new kinds of constancy, however, the compiler would now have to /prove/ that the return value was guaranteed always to be a slice of t, and generate a compile error if not. Proving that might be non-trivial! So, even though I believe that returning such a slice is not quite so unsound as Walter believes, I nonetheless argue that doing so would be too complicated to implement, and therefore should not be implemented. You don't need it, because you can just return indeces instead, and then take the slice at the call site. Returning indeces is the easy way, and seems sufficient to me.
Mar 24 2008
next sibling parent reply BCS <ao pathlink.com> writes:
Reply to Janice,

 however, the
 compiler would now have to /prove/ that the return value was
 guaranteed always to be a slice of t,

The fn could also return anything that is totaly mutable.
Mar 24 2008
parent BCS <BCS pathlink.com> writes:
Janice Caron wrote:
 On 25/03/2008, BCS <ao pathlink.com> wrote:
 
The fn could also return anything that is totaly mutable.

Not if the "in" function parameter was invariant.

Granted. So i guess it would be correct to say "suitably mutable".
Mar 25 2008
prev sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 25/03/2008, BCS <ao pathlink.com> wrote:
 The fn could also return anything that is totaly mutable.

Not if the "in" function parameter was invariant.
Mar 25 2008