www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Container templates and constancy of elements

reply Stewart Gordon <smjg_1998 yahoo.com> writes:
With some containers, such as lists and trees (binary search trees aside), it
doesn't 
matter if the elements can change state, since the data structure remains
well-formed.

However, with others, such as sets and AAs, it's desirable if the keys don't
mutate.  Any 
operations on them won't work right if there are keys in the wrong hash slots
or out of 
order (depending on the underlying data structure) because they have changed
since being 
put into the container.  Of course, this doesn't apply to _values_ in an AA,
since these 
can safely mutate.

In Java and D1, it seems you just have to go on trust if you want your
container to accept 
key types other than strict value types.  But can we do better in D2 - short of
forcing 
the key type to be immutable, which would hinder the container's usefulness?

But it seems D2 has taken one step in that general direction by automatically 
tail-consting the key type of an AA.  But it doesn't stop modifications to the
key through 
mutable references to the data.  And if the key is of class type, you can still
modify the 
object, since D2 conflates constancy of an object reference with constancy of
the object 
itself.  This is in itself silly.  Really, D should have tail-const built in
for stuff 
like this.  OK, so there's Rebindable, but I've found it to be a PITA when
trying to do 
generic programming with it, as well as leading to this AA key anomaly because
it isn't a 
built-in feature.


I suppose there are a few possibilities for what constancy to apply to elements
of a data 
structure to which changes might affect the structure's integrity:

(a) force the type to be tail-const if it's an array, otherwise don't add any
constancy
(b) force the type to be tail-const if it's an array, or fully const if it's a
class
(c) force the type to be tail-immutable if it's an array, otherwise don't add
any constancy
(d) force the type to be tail-immutable if it's an array, or fully immutable if
it's a class
(e) don't add any constancy, but just rely on trust


Currently, AAs implement (a).  (d) guarantees that changes to the data that
mess up the 
data structure will never happen, at least if the programmer doesn't bypass the
const 
system.  (e) is the route D1 and Java are forced to take.  Am I right in
thinking that 
sets and maps in the C++ STL take the same path?


Anyway, what are people's thoughts on the right way to go here?

Stewart.
Mar 14 2012
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Mar 14, 2012 at 06:51:54PM +0000, Stewart Gordon wrote:
 With some containers, such as lists and trees (binary search trees
 aside), it doesn't matter if the elements can change state, since
 the data structure remains well-formed.
 
 However, with others, such as sets and AAs, it's desirable if the
 keys don't mutate.  Any operations on them won't work right if there
 are keys in the wrong hash slots or out of order (depending on the
 underlying data structure) because they have changed since being put
 into the container.  Of course, this doesn't apply to _values_ in an
 AA, since these can safely mutate.

IMO, AA keys should be *implicitly* immutable. That is, when you declare something like: int[ubyte[]] x; then the key type of x should be immutable(ubyte)[], not ubyte[]. Otherwise, it just doesn't make any sense, and causes several of the AA-related bugs currently on the bugtracker.
 In Java and D1, it seems you just have to go on trust if you want
 your container to accept key types other than strict value types.
 But can we do better in D2 - short of forcing the key type to be
 immutable, which would hinder the container's usefulness?
 
 But it seems D2 has taken one step in that general direction by
 automatically tail-consting the key type of an AA.  But it doesn't
 stop modifications to the key through mutable references to the
 data.

Exactly. AA keys must be immutable, no matter what. Of course, to interact nicely with existing code, methods like .opIndex or .get should also accept mutable keys for lookup purposes, and .idup them when a new entry needs to be created.
 And if the key is of class type, you can still modify the object,
 since D2 conflates constancy of an object reference with constancy of
 the object itself.  This is in itself silly.  Really, D should have
 tail-const built in for stuff like this.

This is a major PITA. Especially if you're trying to be const-correct in your code, then it leads to nonsense like being unable to traverse a linked list inside a const method, because the iteration pointer itself is const (whereas it's the *data* it's pointing to that's const) so you can't overwrite the loop variable. However, you *can* recursively search the list. I find this to be a major WAT in D. [...]
 I suppose there are a few possibilities for what constancy to apply
 to elements of a data structure to which changes might affect the
 structure's integrity:
 
 (a) force the type to be tail-const if it's an array, otherwise don't add any
constancy
 (b) force the type to be tail-const if it's an array, or fully const if it's a
class
 (c) force the type to be tail-immutable if it's an array, otherwise don't add
any constancy
 (d) force the type to be tail-immutable if it's an array, or fully immutable
if it's a class
 (e) don't add any constancy, but just rely on trust
 
 
 Currently, AAs implement (a).

Which is prone to bugs.
 (d) guarantees that changes to the data that mess up the data
 structure will never happen, at least if the programmer doesn't bypass
 the const system.

I believe this is the best way to go. Well, at least for AA's this is needed. Otherwise there will always be the possibility that AA's will malfunction when the key changes behind the container's back. T -- Just because you survived after you did it, doesn't mean it wasn't stupid!
Mar 14 2012
next sibling parent =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 03/14/2012 12:24 PM, H. S. Teoh wrote:
 On Wed, Mar 14, 2012 at 06:51:54PM +0000, Stewart Gordon wrote:
 With some containers, such as lists and trees (binary search trees
 aside), it doesn't matter if the elements can change state, since
 the data structure remains well-formed.

 However, with others, such as sets and AAs, it's desirable if the
 keys don't mutate.  Any operations on them won't work right if there
 are keys in the wrong hash slots or out of order (depending on the
 underlying data structure) because they have changed since being put
 into the container.  Of course, this doesn't apply to _values_ in an
 AA, since these can safely mutate.

IMO, AA keys should be *implicitly* immutable. That is, when you declare something like: int[ubyte[]] x; then the key type of x should be immutable(ubyte)[], not ubyte[].

Yes, the internally stored copy of the key should be immutable(ubyte[]). Please note the difference from yours, but I guess that extra protection is just to protect the developers from themselves by avoiding mutating the key in the implementation. But the conceptual key type of the AA should be const(ubyte[]). The reason is, a const parameter is welcoming: It says "I accept both mutable and immutable as arguments". On the other hand, an immutable parameter is restricting: It says "I demand only immutable as argument". The following opIndex accepts keys of five different types of immutability: class SingleElementAA { immutable(uint[]) key; int value; ref int opIndex(const(uint[]) key) { return value; } } void main() { auto aa = new SingleElementAA(); uint[] mutable; immutable(uint)[] element_immutable; immutable(uint[]) all_immutable; enum uint[] enum_immutable = [ 1 ]; aa[mutable] = 42; aa[element_immutable] = 43; aa[all_immutable] = 44; aa[enum_immutable] = 45; aa[[0, 1]] = 42; // literal key } Ali
Mar 14 2012
prev sibling parent Stewart Gordon <smjg_1998 yahoo.com> writes:
On 14/03/2012 19:24, H. S. Teoh wrote:
<snip>
 Exactly. AA keys must be immutable, no matter what. Of course, to
 interact nicely with existing code, methods like .opIndex or .get should
 also accept mutable keys for lookup purposes, and .idup them when a new
 entry needs to be created.

This would rely on class authors making sure they define .idup. We would also need a deep version of .idup for array-of-array and array-of-class types. <snip>
 (d) guarantees that changes to the data that mess up the data
 structure will never happen, at least if the programmer doesn't bypass
 the const system.

I believe this is the best way to go. Well, at least for AA's this is needed. Otherwise there will always be the possibility that AA's will malfunction when the key changes behind the container's back.

Thinking about it, if we're going to go this far, maybe we could just make the key type fully immutable whatever it is. This would enable the key variable in a foreach to be ref for efficiency (especially useful if it's a struct type) and still prevent changing of the key through it. But this precludes implementing hash slots as arrays, at least if you want to be able to delete elements. I might have to rethink my strategy here as well. A linked list would get around it but increase the memory allocation overhead - not sure if this is worth worrying about. Stewart.
Mar 14 2012
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 14 Mar 2012 14:51:54 -0400, Stewart Gordon <smjg_1998 yahoo.com>  
wrote:

 With some containers, such as lists and trees (binary search trees  
 aside), it doesn't matter if the elements can change state, since the  
 data structure remains well-formed.

 However, with others, such as sets and AAs, it's desirable if the keys  
 don't mutate.  Any operations on them won't work right if there are keys  
 in the wrong hash slots or out of order (depending on the underlying  
 data structure) because they have changed since being put into the  
 container.  Of course, this doesn't apply to _values_ in an AA, since  
 these can safely mutate.

 In Java and D1, it seems you just have to go on trust if you want your  
 container to accept key types other than strict value types.  But can we  
 do better in D2 - short of forcing the key type to be immutable, which  
 would hinder the container's usefulness?

 But it seems D2 has taken one step in that general direction by  
 automatically tail-consting the key type of an AA.  But it doesn't stop  
 modifications to the key through mutable references to the data.  And if  
 the key is of class type, you can still modify the object, since D2  
 conflates constancy of an object reference with constancy of the object  
 itself.  This is in itself silly.  Really, D should have tail-const  
 built in for stuff like this.  OK, so there's Rebindable, but I've found  
 it to be a PITA when trying to do generic programming with it, as well  
 as leading to this AA key anomaly because it isn't a built-in feature.


 I suppose there are a few possibilities for what constancy to apply to  
 elements of a data structure to which changes might affect the  
 structure's integrity:

 (a) force the type to be tail-const if it's an array, otherwise don't  
 add any constancy
 (b) force the type to be tail-const if it's an array, or fully const if  
 it's a class
 (c) force the type to be tail-immutable if it's an array, otherwise  
 don't add any constancy
 (d) force the type to be tail-immutable if it's an array, or fully  
 immutable if it's a class
 (e) don't add any constancy, but just rely on trust


 Currently, AAs implement (a).  (d) guarantees that changes to the data  
 that mess up the data structure will never happen, at least if the  
 programmer doesn't bypass the const system.  (e) is the route D1 and  
 Java are forced to take.  Am I right in thinking that sets and maps in  
 the C++ STL take the same path?


 Anyway, what are people's thoughts on the right way to go here?

IMO, safe code should only allow d, system/trusted should allow e. And the compiler shouldn't be "helping" you by adding const annotations. That is: int[char[]] aa; this line should either compile, and the type should be int[char[]] aa, or it should not compile. D is supposed to be able to do "at your own risk" bare-metal types of things. This seems like it unnecessarily limits code. -Steve
Mar 14 2012