digitalmars.D - Writing const-correct code in D

Kevin Bealer <Kevin_member pathlink.com> writes:
Also, this is not full "C++ const", only parameter passing and const
methods, which seems to be the most popular parts of the const idea.
It seems like it should require more syntax than C++, but it only
takes a small amount.


When working with types like "int", use "in" - const is not too much
of an issue here.

The same is true for structs: they get copied in, which is fine for
small structs.  For larger structs, you might want to pass by "in *",
i.e. use "in Foo *".  You can modify this technique to work with structs;
for that, see the last item in the numbered list at the end.


For classes, the issue is that the pointer will not be modified with
the "in" convention, but the values in the class may be.

 // "Problem" code
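
The listing that followed was lost when the post was truncated. As a stand-in, here is a rough C++ analogue of the situation just described. It is not the author's D code, and `Counter`, `bump` and `peek` are made-up names: a pointer passed by value cannot be reseated for the caller, yet the object behind it stays freely modifiable.

```cpp
// Hypothetical illustration in C++, not the original D listing.
struct Counter {
    int value = 0;
};

// The pointer itself is a copy, much like a D class reference passed "in":
// reassigning `c` here does not affect the caller, but the object it
// points to is shared and can still be changed.
inline void bump(Counter* c) {
    c->value += 1;   // modifies the caller's object
    c = nullptr;     // only changes the local copy of the pointer
}

// A pointer-to-const is how C++ says "read only through this parameter".
inline int peek(const Counter* c) {
    // c->value += 1;   // would not compile: *c is const here
    return c->value;
}
```

Calling `bump(&k)` on a fresh `Counter k` leaves `k.value` at 1, which is the kind of modification the "in" convention alone does not rule out.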
Mar 07 2006
"Lionello Lunesu" <lio remove.lunesu.com> writes:
eeeeeeeeeeee, a pointer! I thought we got rid of those...

L.
Mar 08 2006
xs0 <xs0 xs0.com> writes:
OK, this is like the 5000th post I've read regarding const correctness and related issues in D. Can we really not come to some kind of an agreement on what would be best? I'm sure if there's a consensus about a solution, Walter will eventually implement it.

I've read the paper Andrew posted a link to in the last const thread, and I really like that system:

http://pag.csail.mit.edu/pubs/ref-immutability-oopsla2005-abstract.html

Walter, have you also read it? What do you think?

Here's a bad summary, if you don't feel like reading 20 pages :)

References/pointers have two properties, assignability and mutability. Assignability is already handled in D by declaring something const, which prevents reassignment, but there is no notion of immutability. Javari introduces a readonly keyword, which applies to a reference. When a reference/pointer is readonly, it means that the data it points to, and also all transitively reachable data, cannot be changed through it. Note that there is no implication that the data will not change through other references. And, obviously, you can't assign a readonly reference into a mutable var, but you can do the opposite.

Well, that's the gist of it; other features in random order include:

- overloading based on mutability of this:

    class StringWithDate {
        Date getDate() { return m_date; }
        // returns a mutable Date, can be called
        // only through a mutable reference

        readonly Date getDate() readonly { return m_date; }
        // - the second readonly applies to this
        // - can be called only through a readonly ref
        // - the Date returned could still be mutable
        //   but probably the implementation would dup it first
    }

- romaybe keyword for simple cases like above:

    romaybe Date getDate() romaybe { return m_date; }
    // is exactly the same as the two funcs above; romaybe
    // basically expands into two methods, one replaces
    // all romaybes with readonly, the other with mutable

- readonly classes:

    readonly class ConstString {
        ...
    }
    // will make all references to ConstString immutable
    // (much like how auto classes work)

- conceptually, each class definition (say Foo : Bar) produces two new types, "readonly Foo : readonly Bar" and "Foo : readonly Foo, Bar", the first of which only contains readonly methods, while the second contains all others. That makes it trivial to do verification and overloading, and has a nice feature that there is no need to actually compile the readonly version, as all verification is done statically, so there is no increase in code size etc.

- one can still explicitly allow changing an object's fields even through readonly references, which is useful for things like caching hashcodes, logging, etc., which do not change an object's "abstract state" but do still have side effects

The problem of having to write two versions of functions, depending on mutability, is somewhat helped by "romaybe", and could be further eased if the compiler did some simple inference on its own:

- class fields are "important" by default and "ignorable" if they are explicitly declared "mutable" or "readonly" (readonly is ignorable, because it cannot be changed in the first place; mutable actually declares the field to be ignorable)
- any method that could write to important fields or could call a non-readonly method on them is inferred to be mutable, unless specified otherwise
- other methods are considered readonly, unless specified otherwise
- "in" parameters are resolved analogously to the above, "inout" and "out" default to mutable
- return values default to mutable

So, would a system like this be acceptable?

xs0
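
For readers who know C++ better than Javari, two of the listed features have close C++ relatives: overloading on the mutability of `this`, and fields that may change even through a read-only reference. The sketch below uses C++ const as a stand-in for readonly; it is only an approximation, since C++ const is not transitive the way Javari's readonly is. All names (`Date`, `StringWithDate`, `hash`) are illustrative.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical names throughout; C++ const stands in for Javari's readonly.
struct Date {
    int year = 2006;
};

class StringWithDate {
public:
    explicit StringWithDate(std::string text) : m_text(std::move(text)) {}

    // Chosen when the object is reached through a mutable reference.
    Date& getDate() { return m_date; }

    // Chosen when the object is reached through a const reference.
    const Date& getDate() const { return m_date; }

    // Caches the hash even on a const object: the cache fields are declared
    // `mutable`, so they are not treated as part of the const-protected state.
    std::size_t hash() const {
        if (!m_hashValid) {
            m_hash = std::hash<std::string>()(m_text);
            m_hashValid = true;
        }
        return m_hash;
    }

private:
    std::string m_text;
    Date m_date;
    mutable std::size_t m_hash = 0;
    mutable bool m_hashValid = false;
};
```

Note that the two `getDate` bodies are identical apart from const; that duplication is what the proposed romaybe keyword is meant to remove.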
Mar 08 2006
Johan Granberg <lijat.meREM OVEgmail.com> writes:
xs0 wrote:
 So, would a system like this be acceptable?

I would like it; this is how I have been using const in C++.

Mar 08 2006
xs0 <xs0 xs0.com> writes:
 I would like it; this is how I have been using const in C++.

Well, there are some differences from C++ const:

- from what I gather, const is shallow, not deep
- there is no mutability inference in C++
- there's no romaybe, so lots of code is written twice without any real need for it
- support for arrays is limited
- the name is much worse :)

See the paper for a longer discussion.

xs0
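
The "shallow, not deep" point can be seen in a few lines of C++. This is a minimal sketch with made-up names (`Node`, `touch_neighbour`): const on an object protects its own fields, including the pointer value stored in it, but not the object that pointer refers to.

```cpp
// Hypothetical example: C++ const does not propagate through pointers.
struct Node {
    int payload;
    Node* next;   // pointer member: const-ness of a Node stops here
};

// `head` is a reference-to-const, so head.payload and head.next cannot be
// reassigned, but the Node that head.next points to is still mutable.
inline void touch_neighbour(const Node& head) {
    // head.payload = 1;      // would not compile
    // head.next = nullptr;   // would not compile
    if (head.next != nullptr) {
        head.next->payload = 42;   // compiles: const is shallow
    }
}
```

A transitive readonly, as described in the paper, would reject the last assignment as well.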
Mar 08 2006
Don Clugston <dac nospam.com.au> writes:
Johan Granberg wrote:
 xs0 wrote:
 So, would a system like this be acceptable?


I think having separate overloads for const and non-const parameters is a design mistake in C++. I see C++ 'const' as a compile-time contract, not a type. IMHO, overloading const vs non-const is like overloading functions based on their contracts.

I suspect that 'const' has exaggerated importance to those of us from a C++ background, because it's almost the only contract that C++ provides. Maybe the solution to 'const' is a more general compile-time contract mechanism.
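
For anyone unsure which C++ feature is being criticised here, this is a small sketch of it; `Buffer` and `describe` are invented for the example. Two functions differ only in the const-ness of their parameter, and the compiler picks one from the static type at the call site.

```cpp
#include <string>

struct Buffer {
    int size = 0;
};

// Two overloads that differ only in the const-ness of the parameter.
// Overload resolution picks one from the static type of the argument.
inline std::string describe(Buffer&)       { return "mutable"; }
inline std::string describe(const Buffer&) { return "const"; }
```

The same object can therefore reach either overload: `describe(b)` selects the first, while going through `const Buffer& view = b;` selects the second.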
Mar 08 2006
xs0 <xs0 xs0.com> writes:
Don Clugston wrote:
 Johan Granberg wrote:
 xs0 wrote:
 So, would a system like this be acceptable?


 I think having separate overloads for const and non-const parameters is a
 design mistake in C++. I see C++ 'const' as a compile-time contract, not a
 type. IMHO, overloading const vs non-const is like overloading functions
 based on their contracts.

 I suspect that 'const' has exaggerated importance to those of us from a
 C++ background, because it's almost the only contract that C++ provides.
 Maybe the solution to 'const' is a more general compile-time contract
 mechanism.

Well, stepping back a little, it is indeed about contracts, specifically preventing one function/thread from modifying data from another function/thread (where the idea is not to offer any absolute guarantees, but rather to make it impossible to write bad code unintentionally). Unlike you, though, I think this type of contract is so common, it deserves a special mechanism.

In any case, two questions arise:

1) what exactly does it mean to modify something (specifically a composite type like an object)
2) how to design language support so it is able to express those contracts in a simple manner (it should be both simple to use and simple to implement)

Considering it's impossible for the compiler to know what you consider an object's state, you'll have to annotate fields and methods in some cases, one way or another. Considering that, the defaults proposed seem reasonable to me:

- fields are considered to contribute to object's state, unless specified otherwise (as most fields are indeed part of state)
- methods are inferred to read/modify the state based on which fields they modify

Just looking at random classes in my workspace, I'd say those will get at least 95% of classes correct, without the coder doing any work at all.

The only thing remaining is to specify which of the references you want to be read-only (which again you need to do in any case). There are basically two cases that need to be covered (in others, mutability is more or less implied):

- a function wants to prevent modification of the return value (both using "return" or "out")
- a caller wants to prevent modification of "in" parameters

Again, I see no other solution than to annotate both, but unlike with C++, inference on mutability of in parameters will mean that you won't have to annotate each and every in parameter which remains unchanged, but rather only those where it is not obvious.

Finally, as far as "two types per type" are concerned, it was more of an implementation detail than anything else - by treating most of the compilation this way, you can already use existing mechanisms for overload resolution etc.; there definitely is no need for actually having two types. Readonliness is just an attribute of a reference.

xs0
Mar 08 2006
"Andrew Fedoniouk" <news terrainformatica.com> writes:
 I think having separate overloads for const and non-const parameters is a
 design mistake in C++. I see C++ 'const' as a compile-time contract, not a 
 type. IMHO, overloading const vs non-const is like overloading functions 
 based on their contracts.

 I suspect that 'const' has exaggerated importance to those of us from a 
 C++ background, because it's almost the only contract that C++ provides.
 Maybe the solution to 'const' is a more general compile-time contract 
 mechanism.

Let's say we have two declarations:

T[] and readonly T[]

These two describe two different types - they have two different sets of methods:

T[] has opSliceAssign and opIndexAssign.
In contrast, readonly T[] has no such methods.

This is why they are two distinct types. And it is highly desirable that they be treated as types and not as any sort of contracts. Think about template instantiation, static if's and so on.

Andrew Fedoniouk.
http://terrainformatica.com
Mar 08 2006
xs0 <xs0 xs0.com> writes:
Andrew Fedoniouk wrote:
 Let's say we have two declarations:

 T[] and readonly T[]

 These two describe two different types - they have two different sets of
 methods:

 T[] has opSliceAssign and opIndexAssign.
 In contrast, readonly T[] has no such methods.

 This is why they are two distinct types. And it is highly desirable that
 they will be treated as types and not as any sort of contracts. Think about
 template instantiation, static if's and so on.

Is it necessary that there are two distinct types? I feel it would be enough if there were one type with two sets of methods, one of which could always be called, and the other only through a non-readonly reference. Are there any benefits to having two distinct types?

xs0
Mar 08 2006
"Andrew Fedoniouk" <news terrainformatica.com> writes:
 Is it necessary that there are two distinct types? I feel it would be
 enough, if there was one type with two sets of methods, one of which could
 always be called, and the other only through a non-readonly reference. Are
 there any benefits to having two distinct types?

Logically it is one type - a region from ptr up to ptr+length. But in these two cases it has a different set of operations. You may *interpret* them as different types.

readonly is a filter - it filters out all mutators from the base type.

static if( T[] has(method) opSliceAssign ) -> compile-time-true
and
static if( readonly T[] has(method) opSliceAssign ) -> compile-time-false

Andrew Fedoniouk.
http://terrainformatica.com
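
The "has(method)" test above is pseudocode, but the idea of asking at compile time whether a view still offers its mutators can be sketched in C++ with standard type traits. Everything named here (`Slice`, `ReadonlySlice`, `has_index_assign`) is invented for the illustration and is not a D or C++ library facility.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical array views; ReadonlySlice plays the role of "readonly T[]".
template <typename T>
struct Slice {
    T* ptr;
    std::size_t length;
    T& operator[](std::size_t i) const { return ptr[i]; }
};

template <typename T>
struct ReadonlySlice {
    const T* ptr;
    std::size_t length;
    const T& operator[](std::size_t i) const { return ptr[i]; }
};

// Compile-time query: can an element of View be assigned a value of type T?
template <typename View, typename T>
struct has_index_assign
    : std::is_assignable<decltype(std::declval<View&>()[0]), T> {};

static_assert(has_index_assign<Slice<int>, int>::value,
              "mutable slice supports element assignment");
static_assert(!has_index_assign<ReadonlySlice<int>, int>::value,
              "readonly slice filters the mutator out");
```

Both views cover the same region of memory; only the set of operations visible to templates differs, which is the distinction being argued for.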
Mar 08 2006
Kevin Bealer <Kevin_member pathlink.com> writes:
In article <dumhec$21bq$1 digitaldaemon.com>, xs0 says...
Kevin Bealer wrote:

OK, this is like the 5000th post I've read regarding const correctness 
and related issues in D. Can we really not come to some kind of an 
agreement on what would be best? I'm sure if there's a consensus about a 
solution, Walter will eventually implement it.

- conceptually, each class definition (say Foo : Bar) produces two new 
types, "readonly Foo : readonly Bar" and "Foo : readonly Foo, Bar", the 
first of which only contains readonly methods, while the second contains 
all others. That makes it trivial to do verification and overloading, 
and has a nice feature that there is no need to actually compile the 
readonly version, as all verification is done statically, so there is no 
increase in code size etc.

This sounds like it has nearly the same properties as what I'm proposing. The main difference seems to be that you can designate methods as only usable in the readonly case. (I'm not sure what that would be useful for, though.) Notice that the inheritance hierarchy described in the paragraph above is the *same* as what I wrote.

My point, if you read my post, was that you don't need to introduce new keywords or syntax. If you want or need the division between const/nonconst for a particular type, it can be written in existing D.

Kevin
Mar 08 2006
Kevin Bealer <Kevin_member pathlink.com> writes:
In article <dun85t$33t$1 digitaldaemon.com>, Kevin Bealer says...
Notice that the inheritance hierarchy described in the paragraph above is
the *same* as what I wrote.

But that part got cut off (sorry about that); you can find it in the second thread by me with the same title.

- Kevin
My point, if you read my post, was that you don't need to introduce new keywords
or syntax.  If you want or need the division between const/nonconst for a
particular type, it can be written in existing D.

Kevin

Mar 08 2006
Kevin Bealer <Kevin_member pathlink.com> writes:
In article <duluq5$11ki$1 digitaldaemon.com>, Kevin Bealer says...
 // "Problem" code

Oops .. this seems to have been cut off.. apparently a single "." causes this reader to destroy email. I'll post the other 80% of the post again tonight.

Kevin
Mar 08 2006
pragma <pragma_member pathlink.com> writes:
In article <dunjv7$kpj$1 digitaldaemon.com>, Kevin Bealer says...
Oops .. this seems to have been cut off.. apparently a single "." causes this
reader to destroy email.  I'll post the other 80% of the post again tonight.

For what it's worth, the SMTP and NNTP protocols* both use the sequence "\r\n.\r\n" as meaning "end of data". Depending on the quality of your newsreader**, it might not be aware of this and won't filter it out of your post. Having that embedded gives you an unexpectedly truncated message, just like you have here.

* - For those not in the know, these are for email (outbound) and usenet respectively
** - Was it the newsreader on digitalmars.com?

- EricAnderton at yahoo
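
The usual remedy on the sending side is "dot-stuffing": a client prepends an extra "." to any body line that begins with one, and the receiver strips it again. Here is a minimal C++ sketch of the sending half, assuming the body already uses "\r\n" line endings; `dot_stuff` is a made-up helper name, not part of any mail or news library.

```cpp
#include <string>

// Sketch of dot-stuffing as done by SMTP and NNTP senders: any body line
// that begins with '.' gets an extra '.' prepended so that it cannot be
// mistaken for the "\r\n.\r\n" end-of-data marker.
// Assumes the body uses "\r\n" line endings; dot_stuff is a made-up name.
inline std::string dot_stuff(const std::string& body) {
    std::string out;
    bool at_line_start = true;
    for (std::string::size_type i = 0; i < body.size(); ++i) {
        const char ch = body[i];
        if (at_line_start && ch == '.') {
            out += '.';                 // escape the leading dot
        }
        out += ch;
        at_line_start = (ch == '\n');   // the next character starts a new line
    }
    if (!at_line_start) {
        out += "\r\n";                  // make sure the last line is terminated
    }
    out += ".\r\n";                     // the real end-of-data marker
    return out;
}
```

With this in place, a post containing a line that is just "." goes out as ".." on the wire and no longer ends the message early.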
Mar 08 2006
Kevin Bealer <Kevin_member pathlink.com> writes:
In article <dunl76$m25$1 digitaldaemon.com>, pragma says...
** - Was it the newsreader on digitalmars.com?

Yes it was - this "\n.\n" concept seems vaguely and anciently familiar. I'm posting again with a different "please don't also lose indentation" character.

Kevin
Mar 08 2006