digitalmars.D - Writing const-correct code in D

Kevin Bealer <Kevin_member pathlink.com> writes:
Since people want the benefits of const, I'm showing a way to get them
by following coding conventions.  This requires *no* changes to D.


Also, this is not full "C++ const", only parameter passing and const
methods, which seem to be the most popular parts of the const idea.
It seems like it should require more syntax than C++, but it only
takes a small amount.


When working with types like "int", use "in" - const is not too much
of an issue here.

The same is true for structs: they get copied in, which is fine for
small structs.  For larger structs, you might want to pass by pointer,
i.e. use "in Foo *".  You can adapt this technique to structs; for
that, see the last item in the numbered list at the end.


For classes, the issue is that the pointer will not be modified with
the "in" convention, but the values in the class may be.

: // "Problem" code
:
: class Bar {...}
:
: class Foo {
:   this(Bar b)
:   { x1 = b; }
:
:   this(const_Foo b)
:   {
:     x1 = b.x1.dup;
:   }
:
:   // Modifies this Foo
:   void changeBar(Bar b2)
:   { x1 = b2; }
:
:   // Does not modify this Foo
:   int doesWork() {...}
:
: protected:
:   Bar x1;
: };
:
: // NOTE: changes foo1
: void barfoo(in Foo foo1, in Bar b)
: {
:   foo1.changeBar(b);
: }

We'd like barfoo() not to modify foo1 - we want to guarantee it.

To deal with this, you can write a "const interface" for your class.
I recommend the prefix "const_" so that it looks a little like the C++
version.  This interface definition is quite simple to do.  Note that
a Foo is-a const_Foo, and passing it to a const_Foo interface is
legal.  But trying to modify it through that interface will not
compile, because const_Foo has no modifying methods.

NOTE: You don't need any extra method code, except constructors and
optionally the "clone()" method.  What we are doing is SPLITTING the
personality of Foo into two halves - read and write.

: // The read stuff
:
: class const_Foo {
:   this(Bar b)
:   { x1 = b; }
:
:   // Copy constructor; also accepts a Foo, since a Foo is-a const_Foo.
:   this(const_Foo b)
:   { x1 = b.x1.dup; } // assumes Bar provides a dup() method
:
:   // Does not modify Foo
:   int doesWork() {...}
:
:   Foo clone() // how to un-const (optional)
:   {
:      return new Foo(this); // use const->nonconst ctor
:   }
:
: protected:
:   Bar x1;
: }
:
: // The write stuff - can also do read stuff of course.
:
: class Foo : const_Foo {
:   this(Bar b)
:   {
:     super(b); // call the base class constructor
:   }
:
:   this(const_Foo b) // const->nonconst ctor
:   {
:     super(b); // the base class copy constructor dups the Bar
:   }
:
:   void changeBar(Bar b2)
:   {
:     x1 = b2;
:   }
: }
:
: // Can only call this with non-const Foo.
: void barfoo(in Foo foo1, in Bar b)
: {
:   foo1.changeBar(b);
: }
: 
: // Can call this with either const_Foo or Foo.
: void barfaa(in const_Foo foo1)
: {
:   int q = foo1.doesWork();
: }

1. In C++, you need to make the same division into const and
non-const, since every method must be labeled as "const" or not
labeled (and thus unusable in a const object).  So there is no extra
"design burden".

2. You can easily change any method's constness by cut/pasting it to
the other class.  All implementation code/data is shared.

3. Relationships are enforced!  If doesWork() calls changeBar(), the
compiler will complain.

4. The class author decides whether "clone()" and the other special
methods are written at all - so if "Bar" is uncloneable for some
reason (e.g. maybe it's a File), don't write clone() for Foo, or find a
way to get around copying it.  This work needs to be done in C++ too.

5. Users of const_Foo don't need to know what the editable Foo does.
Their code can't break unless the const_ side is changed.  It's now
very hard to miss the distinction between const/non-const, which is
easy to miss in C++ when writing methods for example.

6. Easy to use as a Copy-On-Write design: If you need to store an
object, and don't know if it is const or not, use a const_Foo
reference.  In the event you need to modify it, you can test whether
it is const with a dynamic cast.  If it is, clone it first!

7. In C++, you can also define distinct const and non-const methods
for a class.  This happens automatically here - the non-const method
(if one exists) just overrides the const one.

8. Finally, for OOD/OOP purists: Although the non-const version is not
really "is-a" const, the relationship still holds once you realize
that const is really a "subtracting" adjective - we could use the
terms readable and read/writeable, where it is easy to see that a
read/writeable thing is-a readable thing.

9. You can have "in Foo" parameters and "out const_Foo" without it
being a contradiction.  The first means "I don't want to change what
it points to -- something the caller might also want to know -- but I
might modify it".  The second is a way to return something.  [The
semantics of input and output (argument and return value) are normally
different in OO programming, since one is covariant and the other
contravariant. (This is true in D, right?)]

10. For structs you can do a similar thing:

: // read-write version
: struct X {
:    int opIndex(int i) { ... }
:    int opIndexAssign(int value, int i) { ... }
:    
: private:
:    int[1024] data_;
: };

: // read-only version
: struct const_X {
:    int opIndex(int i) { return impl[i]; }
:    
:    X * clone()
:    {
:       X * copy = new X;
:       *copy = impl; // struct assignment copies the data
:       return copy;
:    }
:    
: private:
:    X impl;
: };
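
The copy-on-write idea from item 6 can be sketched roughly as follows. This is only an illustration written against current D, not code from the post: the `Bar` body, its `dup()` method and the `writable()` helper are made-up names, and the constructors are filled in with one plausible implementation.

```d
import std.stdio;

// Hypothetical stand-in for the Bar used in the post.
class Bar {
    int value;
    this(int v) { value = v; }
    Bar dup() { return new Bar(value); }
}

// Read-only half.
class const_Foo {
    this(Bar b) { x1 = b; }
    this(const_Foo other) { x1 = other.x1.dup; } // deep copy
    int doesWork() { return x1.value; }
    Foo clone() { return new Foo(this); }
protected:
    Bar x1;
}

// Read/write half.
class Foo : const_Foo {
    this(Bar b) { super(b); }
    this(const_Foo other) { super(other); }
    void changeBar(Bar b2) { x1 = b2; }
}

// Hypothetical helper: returns an object that is safe to modify.
// If the stored object already is a Foo it is returned as-is,
// otherwise it is cloned first.
Foo writable(const_Foo stored) {
    Foo f = cast(Foo) stored; // null when stored is a plain const_Foo
    return f !is null ? f : stored.clone();
}

void main() {
    const_Foo frozen = new const_Foo(new Bar(1));
    Foo w = writable(frozen);   // frozen is not a Foo, so this clones
    w.changeBar(new Bar(2));
    writeln(frozen.doesWork(), " ", w.doesWork()); // prints "1 2"
}
```

The dynamic cast is what separates "really const" objects from Foo objects that are merely being viewed through a const_Foo reference; only the former pay for a copy.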

If people like this, maybe something along these lines would be useful
for the C++ programmer intro on the D site?  I can make a more
thorough version if so.  If people use this technique, it might be
good for them to follow the same style, e.g. method names.

Kevin
Mar 08 2006
Brad Roberts <braddr puremagic.com> writes:
On Thu, 9 Mar 2006, Kevin Bealer wrote:

 The same is true for struct, it gets copied in, which is fine for
 small structs.  For larger structs, you might want to pass by "in *",
 i.e. use "in Foo *".  You can modify this technique to use struct, for
 that see the last item in the numbered list at the end.
 
 
 For classes, the issue is that the pointer will not be modified with
 the "in" convention, but the values in the class may be.
 
 :   this(const_Foo b)
 :   {
 :     x1 = b.x1.dup;
 :   }
I must have missed something somewhere along the way.. when did copying
imply const?  To me, something that's const can't be modified.  That
doesn't mean just to the caller, but also to the callee.  It's a
mechanism for saying "this object shouldn't be changed".

Const by duplication doesn't help with the last part and makes it
entirely probable that code will at some point be changed to modify
parts of the passed-in data with the expectation that those changes
actually occur on up through to the caller.

Sorry, const by dup is in some ways even worse than not having const.
I see how it solves some use cases though, so it's not totally worse. :)

Later,
Brad
Mar 08 2006
Kevin Bealer <Kevin_member pathlink.com> writes:
In article <Pine.LNX.4.64.0603081855530.30259 bellevue.puremagic.com>, Brad
Roberts says...
Yeah - this is a tradeoff, but as I understand it, the copy constructor
in D (unlike C++) can't be used for automatic conversions.  So the
person receiving a const_Foo has to do this:

: void dofoo(in const_Foo x)
: {
:   Foo y = new Foo(x); // explicit copy of value(s) from x
:   ...
: }

..in order to get a new one.  Now, normally one would not expect y to
propagate changes back to x, if the syntax looks like the above, right?

These versions won't even compile:

: void doAAA(in Foo x)
: {
:   x.modifyStuff(); // okay here
: }
:
: void doBBB(in const_Foo x)
: {
:   x.modifyStuff(); // failure here - const_Foo doesn't have this method
: }
:
: const_Foo bar;
:
: doAAA(bar); // error: can't convert const_Foo to Foo
: doBBB(bar); // okay here

So -- both are caught at *compile* time.  No duplication unless
requested explicitly.

The clone() method and clone() constructors are designed to do as deep
of a copy as necessary, which means they need to be user defined.
That's why all the proposals for deep .dup don't work - it requires
developer input, the compiler doesn't know enough.

If you wanted to really prevent modification or copy, you can just omit
that method, and not provide a way to go from const_Foo to Foo.  I'm
proposing this as the obvious standard way to de-const -- with a
constructor -- for cases where you want that behavior.

My thinking is that we could set up rules for people who want to do
const, maybe because they have a huge C++ project they are rewriting in
D, and it uses const in complex ways.  Maybe because they just like the
const facility.

Kevin
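
For what it's worth, the compile-time claims above can be checked mechanically in current D with `static assert` and `__traits(compiles, ...)`. The sketch below is an assumption-laden illustration rather than code from the thread: the class bodies are minimal placeholders, and the `in` storage class is left off the parameters because in today's D `in` implies `const`, which would by itself reject the call to changeBar().

```d
class Bar {}

// Read-only half (placeholder body).
class const_Foo {
    this(Bar b) { x1 = b; }
    int doesWork() { return x1 is null ? 0 : 1; }
protected:
    Bar x1;
}

// Read/write half.
class Foo : const_Foo {
    this(Bar b) { super(b); }
    void changeBar(Bar b2) { x1 = b2; }
}

void barfoo(Foo foo1, Bar b) { foo1.changeBar(b); }
int barfaa(const_Foo foo1) { return foo1.doesWork(); }

// A Foo converts implicitly to const_Foo, but not the other way round.
static assert(is(Foo : const_Foo));
static assert(!is(const_Foo : Foo));

void main() {
    const_Foo c = new Foo(new Bar);

    // Reading through the const_ reference is fine...
    static assert(__traits(compiles, barfaa(c)));
    // ...but there is no changeBar() on const_Foo,
    static assert(!__traits(compiles, c.changeBar(new Bar)));
    // and a const_Foo cannot be passed where a Foo is required.
    static assert(!__traits(compiles, barfoo(c, new Bar)));

    assert(barfaa(c) == 1);
}
```

If any of the three rejected expressions ever started to compile, the build itself would fail, which is the kind of guarantee the convention is after.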
Mar 08 2006
xs0 <xs0 xs0.com> writes:
OK, while it might work, your approach has several problems:

- applicability
    - there is no support for arrays and pointers to primitive types;
      especially arrays are a problem
- coding efficiency
    - maintenance of an additional class is bad, having to write
      a wrapper for structs is even worse
    - while arbitrarily flexible, the approach is also error-prone,
      like all manual methods
- runtime efficiency
    - say you have a struct in your object; you can't return a readonly
      pointer to it, so you either have to heap-allocate a new struct,
      or make the wrapper contain a pointer, incurring double
      dereferencing
    - if you do use a pointer, you have to heap-allocate the wrapped
      struct, otherwise you can't pass the read-only version around
      freely
    - two classes means two vtbls, two TypeInfos, ...

Any thoughts on that? :)

xs0
Mar 09 2006
Kevin Bealer <Kevin_member pathlink.com> writes:
In article <dupfcr$erh$1 digitaldaemon.com>, xs0 says...
OK, while it might work, your approach has several problems:

- applicability
    - there is no support for arrays and pointers to primitive types;
      especially arrays are a problem
True. This could be done with a wrapper too, particularly with IFTI, but coding efficiency suffers a little.
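
A wrapper of that kind might look roughly like the following in current D. The names `const_Array` and `readOnly` are invented for this sketch; the helper function exists only so that IFTI can deduce the element type from the argument.

```d
import std.stdio;

// Hypothetical read-only view over an array: it offers indexing and
// length, but nothing that assigns to an element.
struct const_Array(T) {
    private T[] data_;

    T opIndex(size_t i) { return data_[i]; }
    size_t length() { return data_.length; }

    // Explicit copy for callers that need a mutable array.
    T[] clone() { return data_.dup; }
}

// Helper so that IFTI deduces T from the argument.
const_Array!T readOnly(T)(T[] a) {
    return const_Array!T(a);
}

int sum(const_Array!int a) {
    int total = 0;
    foreach (i; 0 .. a.length())
        total += a[i];
    return total;
}

void main() {
    int[] raw = [1, 2, 3];
    auto view = readOnly(raw);  // T is deduced as int
    writeln(sum(view));         // prints 6
    // view[0] = 5;             // would not compile: opIndex returns a copy
}
```

This only protects element types that are copied by value; an array of class references would still hand out references to mutable objects.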
- coding efficiency
    - maintenance of an additional class is bad, having to write
      a wrapper for structs is even worse
C++ requires you to maintain two personalities in one class - you have to make the same decisions, and can make most of the same errors.
    - while arbitrarily flexible, the approach is also error-prone,
      like all manual methods
If you don't write the special methods, there is very little extra work to do, really just adding "class A : B {" and "};" to your code. It should not be much more error prone. If you do write the special methods (clone, special constructors) then this is true. So I agree, partially... see my comments at the end.
- runtime efficiency
    - say you have a struct in your object; you can't return a readonly
      pointer to it, so you either have to heap-allocate a new struct,
      or make the wrapper contain a pointer, incurring double
      dereferencing
But, by heap allocating it, you avoid the cost of copying it during the
return.  The way I see it, you have three options:

1. Return by value - in which case you probably don't need const,
   unless the object contains stuff that you want to protect.
2. Wrap in a readonly struct (no pointers) and return this by value.
   The copy into the readonly struct costs efficiency, about the same
   as returning the original by value.
3. Heap allocations... avoids future copies (good), requires heap
   allocation (bad).

So, a tradeoff.  I admit though, the approach doesn't work as well for
struct as for class.
    - if you do use a pointer, you have to heap-allocate the wrapped
      struct, otherwise you can't pass the read-only version around
      freely
I think you can wrap in a struct and pass the wrapper by value, which copies the internal struct. The value is copied as with any struct, but the wrapper doesn't have the non-const methods, right? A bigger annoyance is the fact that you can't write one function signature that takes either type, unless it takes a template parameter. I.e. structs don't have inheritance, so there's no Foo -> const_Foo implicit conversion.
    - two classes means two vtbls, two TypeInfos, ...
Small potatoes, and not entirely harmful (see below).
Any thoughts on that? :)


xs0
You're right - this approach has definite limits and is not free.
However, it allows some flexibility that C++ (for instance) does not.

: const_Foo f = get_Object();
:
: Foo f_mod = cast(Foo) f;
:
: if (f_mod !is null) {
:   // modify f, only if allowed
: }

With this technique, you can create real "const_Foo" objects (i.e. they
were never a Foo) and test which kind you have at runtime.  Normally all
"Foo" objects are created as "Foo", but this approach enables you to
build objects that are designed to never change - someone has to do the
clone operation to get a modifiable version.

This could be really useful in (for instance) cache designs - get an
object from the cache and pass it anywhere; since the const_Foo class
uses "immutable" semantics (like a Java String), this is safe to do.  It
requires classes that can be built entirely in the constructor, but many
people write code like this now.

Similarly, I could create a Baz object that has all the readonly methods
of Foo, and derive it from const_Foo.  Now you can treat it as a
const_Foo, but cannot cast it to a Foo (since it isn't one).

Why would I do this?  Imagine I have a string class Foo - I could create
a const_Foo derived class where the actual data was a memory mapped
file.  It has all the *readonly* properties of a string, but you can't
resize it.

Or a readonly version of a database interface, which would dynamically
provide DB information for user queries, but would not allow storage
into the database.  I could use this database interface as a front end
for any number of data sources that are not really databases.

You can't do any of these in C++, since C++'s const acts like a
template - all the work happens at compile time.  Having an extra vtbl
allows you to do runtime tricks too - and have compile time static type
correctness for code that just uses Foo without thinking about const.

Kevin
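
As a rough illustration of the "Baz" idea, here is a small sketch in current D. Everything in it is assumed for the example: const_Foo is reduced to a single int, Baz computes its answer from an array standing in for some other data source, and `tryChange()` is a made-up helper wrapping the cast test shown above.

```d
import std.stdio;

// Read-only half, reduced to one value for the sketch.
class const_Foo {
    this(int v) { value_ = v; }
    int doesWork() { return value_; }
protected:
    int value_;
}

// Read/write half.
class Foo : const_Foo {
    this(int v) { super(v); }
    void change(int v) { value_ = v; }
}

// Hypothetical read-only variant that was never a Foo: it answers
// doesWork() from a different backing store.
class Baz : const_Foo {
    this(int[] source) { super(0); source_ = source; }
    override int doesWork() {
        int total = 0;
        foreach (x; source_)
            total += x;
        return total;
    }
private:
    int[] source_;
}

// Modifies the object only when it really is a Foo.
bool tryChange(const_Foo f, int v) {
    Foo f_mod = cast(Foo) f; // null for Baz and plain const_Foo objects
    if (f_mod is null)
        return false;
    f_mod.change(v);
    return true;
}

void main() {
    const_Foo a = new Foo(1);
    const_Foo b = new Baz([1, 2, 3]);
    bool changedA = tryChange(a, 10);
    bool changedB = tryChange(b, 10);
    writeln(changedA, " ", a.doesWork()); // prints "true 10"
    writeln(changedB, " ", b.doesWork()); // prints "false 6"
}
```

Code that only takes a const_Foo works with both objects unchanged, while the cast lets interested code find out at runtime whether modification is possible at all.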
Mar 09 2006