www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 1961] New: Allow "scoped const" contracts

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1961

           Summary: Allow "scoped const" contracts
           Product: D
           Version: 2.013
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: bugzilla digitalmars.com
        ReportedBy: schveiguy yahoo.com


The current const scheme does not allow scoped const.  A scoped const contract
means a function will not change an input parameter or anything referred to
through that input parameter, but the function does not restrict the contract
that the calling function has with the parameter.

First I will describe the problem I am trying to solve, then describe my
solution.

Two examples come to mind, C's strstr function, and C++'s min function, written
in D form without const here:

char[] strstr(char[] source, char[] pattern);

T min(T)(T a, T b);

First strstr.  Source is an input type, and should not be changed in the
function.  But the return value is a substring of source, and so the constancy
must be consistent.  In the current const scheme, this would be implemented as:

const(char)[] strstr(const(char)[] source, const(char)[] pattern)

Now, however, strstr cannot be used in an iterative pattern on a mutable or
invariant string:

char[] x = "hello".dup;

x = strstr(x, "l"); // error

So essentially, strstr is modifying the contract that the calling function has
with x, namely that once strstr is called on x, any result must be const.  This
sort of contract modification is not necessary in the const scheme, as one is
perfectly allowed to change x.

With an invariant version, you cannot re-assign to the argument also:

auto y = "hello";
y = strstr(y, "l"); // error

Again, we're modifying the contract.

If we solve the problem using templates, then we have:

T[] strstr(T)(T[] source, const(T)[] pattern);

Note, however, that this does not achieve what we want.  We are not
guaranteeing that strstr does not modify source in the case of a mutable
argument passed into the function.  There could be a static-if clause in the
function that allows strstr to compile and modify source if it is mutable.  A
user of the function cannot rely on strstr not changing source by looking at
the method signature, he must inspect the implementation.

Now let us look at the second example, min:

T min(T)(T a, T b);

Like the template version of strstr, this does not guarantee the constancy of
it's arguments if T is mutable.

If we add const such that min does not modify it's arguments it becomes:

const(T) min(T)(const(T) a, const(T) b);

And now, min is no longer as useful, because it cannot be used in the most
common case:

t = min(t, maxvalue); // clamp t to be <= maxvalue

if t is mutable or invariant.

What we need is a way to say that a parameter is constant only for the scope of
the function, and then reverts to its original constancy when returning.

Here is my proposal (with help from Janice) on how to solve this problem:

Let the 'inout' keyword describe this contract.  The inout keyword is
completely obsoleted by ref, and so can be used for this purpose.  Used like
const, the inout keyword used on a parameter means that that parameter is
essentially const  for the scope of the function.  When used on the return
value, it means that the return value is casted back to the constancy factor of
the inout parameters at the call-site.  The constancy factor of the parameters
follows the following rules:

  - If all inout parameters are homogeneous, meaning they are all const, all
invariant, all mutable, or all inout (think recursive functions), then the
constancy factor is equivalent to the constancy of the parameters.
  - If any inout parameters differ in constancy, then the constancy factor is
equivalent to const.

The following rules are in force during the function scope:

 - inout(T) can be implicitly cast to const(T)
 - inout(T) is transitively inout, meaning access to any member of T is also
inout.
 - The only thing implicitly castable to inout(T) is T.init (think return
null).

At the call-site, the compiler makes the decision as to what the constancy
factor is for the call (see rules above).  It then follows the rules of const
as if the constancy factor was substituted for inout.

Example, strstr:

inout(char)[] strstr(inout(char)[] source, const(char)[] pattern)
{
   /*
    * this would be an error, as const(char)[] is not implicitly castable to 
    * inout(char)[]
    */
   // return pattern;

   for(int i = 0; i + pattern.length <= source.length; i++)
   {
      inout(char)[] tmp = source[i..pattern.length]; // ok
      if(tmp == pattern) // ok, tmp implicitly casts to const(char)[]
         return source[i..$]; // implicitly casts back to call-site source
   }
   return source[$..$]; // to be consistent with strstr.
}

void f()
{
   auto a = "hello";
   a = strstr(a, "llo"); // cf (constancy factor) == invariant
   // char[] b = strstr(a, "llo"); // error, cannot cast invariant to mutable
   char[] b = "hello".dup;
   b = strstr(b, "llo"); // cf == mutable (note that "llo" doesn't play a role
         // because that parameter is not inout)
   const(char)[] c = strstr(b, "llo"); // cf = mutable, ok because mutable
         // implicitly casts to const
   c = strstr(a, "llo"); // cf = invariant, ok invariant casts to const
}

Now, let's see how min works:

inout(T) min(T)(inout(T) a, inout(T) b)
{
  return a < b ? a : b;
}

void f()
{
   invariant(char)[] i = "hello";
   const(char)[] c = "there";
   char[] m = "Walter".dup;

   //i = min(i, c); // error, since i and c vary in constancy, the result
           // is const, and you cannot implicitly cast const to invariant.

   c = min(i, c); // ok, cf == const, because not homogeneous
   c = min(m, c); // ok, cf == const
   c = min(m, i); // ok, cf == const
   i = min(i, "blah"); // ok, cf == invariant, homogeneous
   //m = min(m, c); // error, cf == const because not homogeneous.  
   //m = min(m, "blah"); // error, cf == const
   m = min(m, "blah".dup); // ok
}

The implementation in the compiler of this produces one version of the function
for all permutations of constancy.  This makes this solution superior over the
template solution (in addition to enforcing const for all versions) in that
there is only one function versus 3^x, where x is the number of inout
parameters.

Optionally, the compiler may generate multiple functions if certain versions of
constancy can be built optimized.  for example, if all the inout parameters are
invariant, the resulting function may be optimized differently.

Also optionally, the overload resolution could allow specializations.  For
example, if invariant allows you to avoid locking mutexes, then you may want to
write a special version of the method where everything is invariant.  In this
context, the overload resolution could match against the inout version last.

Note that the choice of 'inout' as a keyword should not degrade from the
semantic meaning of this enhancement.  If 'inout' is no good as a keyword, then
something else is fine with me.


-- 
Mar 31 2008
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #1 from andrei metalanguage.com  2008-03-31 21:34 -------
Looks interesting. I discussed the issue with Janice a bit. The solution seems
sound, and it's not templated, which is important for virtual functions. My
only concern is that it's a new feature, unrelated to all else in the language,
and as such must be learned. Since some people already complain that the const
regime is already complicated, I fear that people might say that inout adds
even more complication.


-- 
Mar 31 2008
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #3 from andrei metalanguage.com  2008-03-31 22:59 -------
(In reply to comment #2)
 On 01/04/2008, d-bugmail puremagic.com <d-bugmail puremagic.com> wrote:
 Since some people already complain that the const
  regime is already complicated,

Yes, they complain about the fact that the current regime forces us to duplicate many functions three times. (C++ makes us do it twice, which is bad enough!) :-) They complain about the choice of words, "const" and "invariant". They /especially/ complain most furious about the word "enum", and the pointlessness of "in". They complain that "const R f()" looks like f is returning a const(R). But they don't (often) complain about transitivity, or anything that really matters.

I think the choice of words is a classic dilemma. Whatever words are chosen, some won't like them, and they will be the ones to voice their opinion. I think this will just pass. What I'm worried about is arms thrown in the air, "This was supposed to be simpler than C++!" etc. I agree that prefix const for methods is confusing. I personally like const(this), as you proposed. I believe "in" will become more interesting once we figure "scope" out (no pun intended). And enum... you'll have to yank that out from my dead cold hands. Extending enum instead of adding yet another way of defining symbolic constants is The Right Thing to do. I am sure people would have realized how ridiculous the whole "manifest" thing is if we first proposed it. We just can't define one more way for each kind of snow there is. But by and large, as Walter said, although the current constancy regime is excellent, we need to find a good PR consultant :o).
  I fear that people might say that inout adds
  even more complication.

It's a wildcard. People understand wildcards. It's simple enough to explain in terms of "this stands for mutable, const or invariant". From that point of view, it can be presented not as a new flavor of constancy, but as token which stands for any of the existing flavors. I think that with that description, people will get it.

I didn't say they don't understand. I just said they have to run to the manual when they see it. It's yet another thing to learn.
 However, I agree that simplifying const is something we should do
 /first/, before throwing in new stuff. Priorities. :-)

The challenge is how... --
Mar 31 2008
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #5 from aarti interia.pl  2008-04-01 09:15 -------
 And enum... you'll have to yank that out from my dead cold hands. Extending
 enum instead of adding yet another way of defining symbolic constants is The
 Right Thing to do. I am sure people would have realized how ridiculous the
 whole "manifest" thing is if we first proposed it. We just can't define one
 more way for each kind of snow there is.

While I agree completely with what you said about 'manifest' keyword, I quite don't agree about choosing 'enum' as word for manifest constant. IMHO there is much better keyword for this purpose: 'alias'. Please notice that using 'alias' will be very consistent: 1. Alias defines new symbols. Literals (e.g true, false, [], "text")are in fact also symbols as they don't have its address. They exists only inside compiler. So both are of 'same kind'. Personally I think about them as "#define" from C. 2. It seems that there will be also alias syntax like below (as you suggested in [1] comment #4): alias TRUE = true; So there will be no real difference between 'enum' and 'alias' as a way of defining manifest constants. Two ways to define same think seems just one too much. 3. Enum can retain its previous meaning as a way to define *collection*. Please notice that not every set of manifest constant creates collection. Also there can possibly be collections of different things than manifest constants e.g. collections of variables. So there is many ways to extend meaning of enum to make it more programmer friendly and useful. That said I think that enum should only define subset of all manifest constant usages - manifest constants which can be ordered into collection. If alias will be extended to syntax 'alias TRUE=true;' (which seems to be right thing) then there is no way to avoid completely duplication of functionality between 'enum' and 'alias'. But IMHO better way is to specialize 'enum' to mean collection (in specific case collection of manifest constant) and 'alias' to mean definition of symbol. Sorry for being a bit off topic. I can put it into new enhancement req. if you wish. [1] I suggested it already in enhancement request 1827# . http://d.puremagic.com/issues/show_bug.cgi?id=1827 --
Apr 01 2008
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #6 from andrei metalanguage.com  2008-04-01 10:30 -------
(In reply to comment #5)
 And enum... you'll have to yank that out from my dead cold hands. Extending
 enum instead of adding yet another way of defining symbolic constants is The
 Right Thing to do. I am sure people would have realized how ridiculous the
 whole "manifest" thing is if we first proposed it. We just can't define one
 more way for each kind of snow there is.

While I agree completely with what you said about 'manifest' keyword, I quite don't agree about choosing 'enum' as word for manifest constant. IMHO there is much better keyword for this purpose: 'alias'. Please notice that using 'alias' will be very consistent:

These are sensible points. Alias was a close contender. There is one issue with it. Consider: alias x = foo(); To the unwary, this looks like, hey, it's an alias - so whenever I write "x", it will be automagically rewritten into "foo()". In reality CTFE will be done against foo(). Before any other proposal comes forth: Enum does not have this problem and has a learning curve that looks more like a flatline. Anonymous enums already have 99.9% of the needed properties, they only didn't accept other types than integrals. Walter made them accept other types than integrals. It is good language design, it solves the problem, and it removes an unjustified limitation. Why the hubbub over this? We have so much bigger fish (whales!) to fry, it's frivolous to spend time on this one shrimp. Let us move on, please. --
Apr 01 2008
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #7 from brunodomedeiros+bugz gmail.com  2008-04-08 09:42 -------
Andrei & co., please don't discuss topics in the bugzilla, especially if they
are not related to the bug. That's what the NG is for, no?

As for this issue, isn't there a similar problem with member functions and the
this parameter? I.e., suppose you want to write a getter: 

  class Foo { 
    Bar bar;
    Bar getBar() {
      return bar;
    }
  }

But then the return value should have the same constness of the this parameter,
that is, when invoked with a const this, the return value should be const, but
when invoked with a mutable this, the return value should be mutable (and
similarly for invariant). It's the same concept as your inout but applied to
the this parameter. However if the getter is defined as in the code above, it
will only work for mutable this. 

How is this problem solved currently in D? I had the impression there was a way
to do it, but I can't remember it, and I couldn't it find in the documentation
either. Or am I mistaken, and there is no way to do it in D (without creating 3
distinct functions)? 


-- 
Apr 08 2008
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #8 from brunodomedeiros+bugz gmail.com  2008-04-27 13:08 -------
(In reply to comment #7)
 How is this problem solved currently in D? I had the impression there was a way
 to do it, but I can't remember it, and I couldn't it find in the documentation
 either. Or am I mistaken, and there is no way to do it in D (without creating 3
 distinct functions)? 

I've just updated myself on D's const system and on old unread posts about const. I found that I was mistaken, and that D didn't already provide a mechanism to do what "scoped const" does. More unexpectedly, I found that it seems neither Walter nor Andrei had thought of this issue before (I assume this from Andrei because of his "Looks interesting." comment... I could be wrong) I find this woefully disconcerting due to the reasons detailed here: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=70593 . (basicly Walter and Andrei seem to be ignorant of a very good and thorough article about a const system proposal for Java, unnecessarily hitting the wall on several issues) Back to proposal at hand, it seems the main question now is this one: Walter Bright wrote:
 
 For me, the question is is solving these issues a large enough problem 
 that justifies adding a rather confusing new variation on const?

--
Apr 27 2008
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #9 from andrei metalanguage.com  2008-04-27 13:36 -------
(In reply to comment #8)
 More unexpectedly, I found that it seems neither Walter nor Andrei had thought
 of this issue before (I assume this from Andrei because of his "Looks
 interesting." comment... I could be wrong)
 I find this woefully disconcerting due to the reasons detailed here:
 http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=70593
 . (basicly Walter and Andrei seem to be ignorant of a very good and thorough
 article about a const system proposal for Java, unnecessarily hitting the wall
 on several issues)

We were well aware on the work on Javari. The link to the relevant paper has been on our internal discussion wiki for over a year, and I knew of that work long before that. If we've been hitting the wall, it must be stupidity more than ignorance. --
Apr 27 2008
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #10 from brunodomedeiros+bugz gmail.com  2008-04-27 15:12
-------
(In reply to comment #9)
 We were well aware on the work on Javari. The link to the relevant paper has
 been on our internal discussion wiki for over a year, and I knew of that work
 long before that. 

Well then, were you previously aware of Javari's romaybe? --
Apr 27 2008
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #11 from schveiguy yahoo.com  2008-04-28 09:59 -------
(In reply to comment #8)
 Back to proposal at hand, it seems the main question now is this one:
 
 Walter Bright wrote:
 
 For me, the question is is solving these issues a large enough problem 
 that justifies adding a rather confusing new variation on const?


IMO, using this proposal does not create more confusion, it creates less confusion. With this proposal, and with properly implemented library functions one does not - need to worry about whether a template function truly does not molest the parameters - need to worry about which function to call - need to read documentation that contains umpteen different versions of the same function, which vary just on constancy - need to do any special casting to get around deficiencies in the library. And a developer no longer needs to implement umpteen different versions of the same function. I think the result of this proposal being adopted might be that people feel less intimidated by const, if they no longer have to worry about whether they can call functions with any variable types. And the results make intuitive sense. --
Apr 28 2008
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #12 from maxmo pochta.ru  2008-12-02 04:19 -------
 Walter Bright wrote:
 For me, the question is is solving these issues a large enough problem 
 that justifies adding a rather confusing new variation on const?


to provide good rom interface. For me, this entirely prevents use of const system in OOP, restricting it to POD alone. I consider this a large enough problem, worth of new flavor of const. And this feature is not that confusing, if anyone wants a really confusing feature, he can look at the D templates :) --
Dec 02 2008
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1961


fawzi gmx.ch changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |noshiika gmail.com




------- Comment #13 from fawzi gmx.ch  2008-12-02 06:54 -------
Another kind of solution to this problem, more complex to create, but maybe
cleaner is to have templates to extract/change the constness of types.

I am not familiar with D2, so maybe what I say is already doable, or fully
wrong. In particular this assumes that a type variable T contains also the
constness attributes. This is not the case now (I think) otherwise things like
the min example would be already correct, but I am not fully sure because the
this template arguments seem to pick constness up.

I am not sure I fully grasp the implications of adding constness attributes to
type variables, I think that one problem could be that the matching becomes
more difficult:

with:
  T min(T)(T a,T b)
then
  min(const(int),invariant(int)) should instantiate with T=const(int)
  min(invariant(int),invariant(int)) should instantiate with T=invariant(int)
  min(const(int),int) should instantiate min(int,int) ???
  min(invariant(int),int) should not instantiate

but I see not insormountable problems

Then one wants functions like
naked!(T) // removed eventual const/invariant from T 
   (naked could maybe also be T.base, as .base is already used in
   a similar context)
constness!(T) // returns an object (int?) representing the constness of T
set_constness!(T,const_val) // returns the type
   const(naked(T)), invariant(naked(T)), naked(T) depending on const_val
also strstr can be solved

set_constness!(char,constness!(U)) strstr(U:char[])(U source, char[] pattern){
// ....
}

Using these functions (and assuming constness is ordered, which is trivial to
do) one can then build something like 
min_constness!(...) // returning the minimal constness of its arguments
max_constness!(...) // returning the maximal constness of its arguments
If one also has aliases like "function" to return the argument type tuple (used
in is expressions).
I think that this in combination with template constraints, allows all the
freedom one could possibly want.


-- 
Dec 02 2008
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1961


fawzi gmx.ch changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|noshiika gmail.com          |




------- Comment #14 from fawzi gmx.ch  2008-12-02 09:00 -------
an obvious possible drawback is code bloat.

If the function that extract/modify the constness are used it is difficult for
the compiler to decide if just one template per base type can be instantiated,
or a different instantiation for each constness variation is needed.


-- 
Dec 02 2008
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #15 from schveiguy yahoo.com  2008-12-02 09:07 -------
(In reply to comment #12)
 This code duplication bug requires to write every rom method in three flavors
 to provide good rom interface. For me, this entirely prevents use of const
 system in OOP, restricting it to POD alone. I consider this a large enough
 problem, worth of new flavor of const. And this feature is not that confusing,
 if anyone wants a really confusing feature, he can look at the D templates :)

Thanks for the support, please vote for the issue so it goes up the list! (In reply to comment #13)
 Another kind of solution to this problem, more complex to create, but maybe
 cleaner is to have templates to extract/change the constness of types.

This works today, in fact it is the same solution I bring up in the beginning of the bug report (with slight variations). But there are several drawbacks: 1. Different arguments create different identical template instantiations. Const does not affect actual assembly generated, it is simply a safeguard for the compiler to enforce. Start creating functions with 2 or 3 const args, and you are exponentially increasing the number of possible template instantiations. 2. As a user of the function, I must read the function to tell if any specialization is done depending on the constness of the type. For example, a static if can easily detect the absence of const and modify the data. In addition, there is no indication that the function works with const types anyways, just seeing a T does not tell me immediately that the function will not modify the args, I must either try to use the function or read the function. 3. Template functions cannot be virtual. 4. What do you do when the return value depends on the constness of multiple arguments? I guess you have a plausible solution, but you still have to mark the function somehow. Using templates, you would need a very convoluted template structure to do this, and you may even need to ensure a certain type of argument. I think my proposal has none of these drawbacks, and all the same benefits as your proposal. The only valid drawback anyone has brought up is the additional const specifier possibly making const even more confusing. Although I still contend that this method would be much easier to understand and explain than any template or set of 3^n different versions of a function. Perhaps the choice of keyword needs to be examined. --
Dec 02 2008
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #16 from fawzi gmx.ch  2008-12-02 09:57 -------
You are right I hadn't fully re-read the your initial posting, it was a long
time ago ;)


-- 
Dec 02 2008
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1961


smjg iname.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |smjg iname.com




------- Comment #17 from smjg iname.com  2008-12-07 11:57 -------
I like this idea.  It would achieve the const-transparency that would be useful
for COW string functions and the like.

http://tinyurl.com/5dp6fn


-- 
Dec 07 2008
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1961


Steven Schveighoffer <schveiguy yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |3748


--- Comment #18 from Steven Schveighoffer <schveiguy yahoo.com> 2010-02-18
06:55:00 PST ---
It has been agreed that this will be implemented (pretty much as described). 
However, the first attempt by Walter did not work correctly.  I filed bug 3748
that specifically identifies the implementation issues.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 18 2010
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1961


Steven Schveighoffer <schveiguy yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


--- Comment #19 from Steven Schveighoffer <schveiguy yahoo.com> 2010-05-18
03:47:59 PDT ---
Closing this, the feature implementation will be tracked via bug 3748 as a bug
instead of an enhancement.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 18 2010