www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - associative arrays: iteration is finally here

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Walter has magically converted his work on T[new] into work on making 
associative arrays true templates defined in druntime and not considered 
very special by the compiler.

This is very exciting because it opens up or simplifies a number of 
possibilities. One is that of implementing true iteration. I actually 
managed to implement last night something that allows you to do:

int[int] aa = [ 1:1 ];
auto iter = aa.each;
writeln(iter.front.key);
writeln(iter.front.value);

Two other iterations are possible: by key and by value (in those cases 
iter.front just returns a key or a value).

One question is, what names should these bear? I am thinking of makign 
opSlice() a universal method of getting the "all" iterator, a default 
that every container must implement.

For AAs, there would be a "iterate keys" and "iterate values" properties 
or functions. How should they be called?


Thanks,

Andrei
Oct 28 2009
next sibling parent reply =?ISO-8859-1?Q?Pelle_M=E5nsson?= <pelle.mansson gmail.com> writes:
Andrei Alexandrescu wrote:
 Walter has magically converted his work on T[new] into work on making 
 associative arrays true templates defined in druntime and not considered 
 very special by the compiler.
 
 This is very exciting because it opens up or simplifies a number of 
 possibilities. One is that of implementing true iteration. I actually 
 managed to implement last night something that allows you to do:
 
 int[int] aa = [ 1:1 ];
 auto iter = aa.each;
 writeln(iter.front.key);
 writeln(iter.front.value);
 
 Two other iterations are possible: by key and by value (in those cases 
 iter.front just returns a key or a value).
 
 One question is, what names should these bear? I am thinking of makign 
 opSlice() a universal method of getting the "all" iterator, a default 
 that every container must implement.
 
 For AAs, there would be a "iterate keys" and "iterate values" properties 
 or functions. How should they be called?
 
 
 Thanks,
 
 Andrei

Also, foreach with a single variable should default to keys, in my opinion.
Oct 28 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Pelle Månsson wrote:
 Andrei Alexandrescu wrote:
 Walter has magically converted his work on T[new] into work on making 
 associative arrays true templates defined in druntime and not 
 considered very special by the compiler.

 This is very exciting because it opens up or simplifies a number of 
 possibilities. One is that of implementing true iteration. I actually 
 managed to implement last night something that allows you to do:

 int[int] aa = [ 1:1 ];
 auto iter = aa.each;
 writeln(iter.front.key);
 writeln(iter.front.value);

 Two other iterations are possible: by key and by value (in those cases 
 iter.front just returns a key or a value).

 One question is, what names should these bear? I am thinking of makign 
 opSlice() a universal method of getting the "all" iterator, a default 
 that every container must implement.

 For AAs, there would be a "iterate keys" and "iterate values" 
 properties or functions. How should they be called?


 Thanks,

 Andrei


The latter two would break existing definitions of keys and values.
 Also, foreach with a single variable should default to keys, in my opinion.

That is debatable as it would make the same code do different things for e.g. vectors and sparse vectors. Andrei
Oct 28 2009
parent reply =?ISO-8859-1?Q?Pelle_M=E5nsson?= <pelle.mansson gmail.com> writes:
Andrei Alexandrescu wrote:
 Pelle Månsson wrote:
 Andrei Alexandrescu wrote:
 Walter has magically converted his work on T[new] into work on making 
 associative arrays true templates defined in druntime and not 
 considered very special by the compiler.

 This is very exciting because it opens up or simplifies a number of 
 possibilities. One is that of implementing true iteration. I actually 
 managed to implement last night something that allows you to do:

 int[int] aa = [ 1:1 ];
 auto iter = aa.each;
 writeln(iter.front.key);
 writeln(iter.front.value);

 Two other iterations are possible: by key and by value (in those 
 cases iter.front just returns a key or a value).

 One question is, what names should these bear? I am thinking of 
 makign opSlice() a universal method of getting the "all" iterator, a 
 default that every container must implement.

 For AAs, there would be a "iterate keys" and "iterate values" 
 properties or functions. How should they be called?


 Thanks,

 Andrei


The latter two would break existing definitions of keys and values.

from the iterator.
 Also, foreach with a single variable should default to keys, in my 
 opinion.

That is debatable as it would make the same code do different things for e.g. vectors and sparse vectors. Andrei

Debatable indeed, but I find myself using either just the keys or the keys and values together, rarely just the values. Maybe that's just me.
Oct 28 2009
parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
Pelle Månsson wrote:
 Andrei Alexandrescu wrote:
 Pelle Månsson wrote:
 Also, foreach with a single variable should default to keys, in my 
 opinion.

That is debatable as it would make the same code do different things for e.g. vectors and sparse vectors. Andrei

Debatable indeed, but I find myself using either just the keys or the keys and values together, rarely just the values. Maybe that's just me.

I've used iteration over values more often than iteration over keys. Besides, I think consistency is important. Since the default for an ordinary array is to iterate over the values, it should be the same for associative arrays. -Lars
Oct 28 2009
parent reply =?ISO-8859-1?Q?Pelle_M=E5nsson?= <pelle.mansson gmail.com> writes:
Lars T. Kyllingstad wrote:
 Pelle Månsson wrote:
 Andrei Alexandrescu wrote:
 Pelle Månsson wrote:
 Also, foreach with a single variable should default to keys, in my 
 opinion.

That is debatable as it would make the same code do different things for e.g. vectors and sparse vectors. Andrei

Debatable indeed, but I find myself using either just the keys or the keys and values together, rarely just the values. Maybe that's just me.

I've used iteration over values more often than iteration over keys. Besides, I think consistency is important. Since the default for an ordinary array is to iterate over the values, it should be the same for associative arrays. -Lars

If you do, shouldn't you be using a regular array? Actually, it doesn't matter all that much, as long as we get .keys and .values as alternatives.
Oct 28 2009
parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
Pelle Månsson wrote:
 Lars T. Kyllingstad wrote:
 Pelle Månsson wrote:
 Andrei Alexandrescu wrote:
 Pelle Månsson wrote:
 Also, foreach with a single variable should default to keys, in my 
 opinion.

That is debatable as it would make the same code do different things for e.g. vectors and sparse vectors. Andrei

Debatable indeed, but I find myself using either just the keys or the keys and values together, rarely just the values. Maybe that's just me.

I've used iteration over values more often than iteration over keys. Besides, I think consistency is important. Since the default for an ordinary array is to iterate over the values, it should be the same for associative arrays. -Lars

If you do, shouldn't you be using a regular array?

Here's an example: class SomeObject { ... } void doStuffWith(SomeObject s) { ... } void doOtherStuffWith(SomeObject s) { ... } // Make a collection of objects indexed by ID strings. SomeObject[string] myObjects; ... // First I just want to do something with one of the // objects, namely the one called "foo". doStuffWith(myObjects["foo"]); // Then, I want to do something with all the objects. foreach (obj; myObjects) doOtherStuffWith(obj); Of course, if iteration was over keys instead of values, I'd just write foreach (id, obj; myObjects) doOtherStuffWith(obj); But then again, right now, when iteration is over values and I want the keys I can just write the same thing. It all comes down to preference, and I prefer things the way they are now. :)
 Actually, it doesn't matter all that much, as long as we get .keys and 
 .values as alternatives.

I still think the default for foreach should be consistent with normal arrays. -Lars
Oct 28 2009
parent reply =?ISO-8859-1?Q?Pelle_M=E5nsson?= <pelle.mansson gmail.com> writes:
Lars T. Kyllingstad wrote:
 Pelle Månsson wrote:
 Lars T. Kyllingstad wrote:
 Pelle Månsson wrote:
 Andrei Alexandrescu wrote:
 Pelle Månsson wrote:
 Also, foreach with a single variable should default to keys, in my 
 opinion.

That is debatable as it would make the same code do different things for e.g. vectors and sparse vectors. Andrei

Debatable indeed, but I find myself using either just the keys or the keys and values together, rarely just the values. Maybe that's just me.

I've used iteration over values more often than iteration over keys. Besides, I think consistency is important. Since the default for an ordinary array is to iterate over the values, it should be the same for associative arrays. -Lars

If you do, shouldn't you be using a regular array?

Here's an example: class SomeObject { ... } void doStuffWith(SomeObject s) { ... } void doOtherStuffWith(SomeObject s) { ... } // Make a collection of objects indexed by ID strings. SomeObject[string] myObjects; ... // First I just want to do something with one of the // objects, namely the one called "foo". doStuffWith(myObjects["foo"]); // Then, I want to do something with all the objects. foreach (obj; myObjects) doOtherStuffWith(obj); Of course, if iteration was over keys instead of values, I'd just write foreach (id, obj; myObjects) doOtherStuffWith(obj); But then again, right now, when iteration is over values and I want the keys I can just write the same thing. It all comes down to preference, and I prefer things the way they are now. :)
 Actually, it doesn't matter all that much, as long as we get .keys and 
 .values as alternatives.

I still think the default for foreach should be consistent with normal arrays. -Lars

if (foo in aa) { //it is in the aa. foreach (f; aa) { // loop over each item in the aa //I expect foo to show up in here, since it is "in" the aa. } } I use key iteration more than I use value iteration, and it is what I am used to. It is, as you say, a matter of preference.
Oct 28 2009
parent reply "Nick Sabalausky" <a a.a> writes:
"Pelle Månsson" <pelle.mansson gmail.com> wrote in message 
news:hcaaro$15e4$1 digitalmars.com...
 I think foreach should be consistent with opIn, that is,
 if (foo in aa) { //it is in the aa.
   foreach (f; aa) { // loop over each item in the aa
     //I expect foo to show up in here, since it is "in" the aa.
   }
 }

 I use key iteration more than I use value iteration, and it is what I am 
 used to. It is, as you say, a matter of preference.

I've thought for a long while that "in" should be value-based (so you can do things like "if(foo in [1,2,7,9])" instead of the not-as-nice "if([1,2,7,9].contains(foo))"), and that there should be some other way to check for the existance of a key (like "aa.hasKey(key)" or "key in aa.keys", or something like that). I need to check for values in an array much more often than I need to check for keys in an aa.
Oct 29 2009
parent =?ISO-8859-1?Q?Pelle_M=E5nsson?= <pelle.mansson gmail.com> writes:
Nick Sabalausky wrote:
 "Pelle Månsson" <pelle.mansson gmail.com> wrote in message 
 news:hcaaro$15e4$1 digitalmars.com...
 I think foreach should be consistent with opIn, that is,
 if (foo in aa) { //it is in the aa.
   foreach (f; aa) { // loop over each item in the aa
     //I expect foo to show up in here, since it is "in" the aa.
   }
 }

 I use key iteration more than I use value iteration, and it is what I am 
 used to. It is, as you say, a matter of preference.

I've thought for a long while that "in" should be value-based (so you can do things like "if(foo in [1,2,7,9])" instead of the not-as-nice "if([1,2,7,9].contains(foo))"), and that there should be some other way to check for the existance of a key (like "aa.hasKey(key)" or "key in aa.keys", or something like that). I need to check for values in an array much more often than I need to check for keys in an aa.

I, too, want opIn to work on arrays. On values. As a linear search. I do not see why you would want to remove it on AA keys, though.
Oct 30 2009
prev sibling next sibling parent Justin Johansson <no spam.com> writes:
Andrei Alexandrescu Wrote:

 Walter has magically converted his work on T[new] into work on making 
 associative arrays true templates defined in druntime and not considered 
 very special by the compiler.
 
 This is very exciting because it opens up or simplifies a number of 
 possibilities. One is that of implementing true iteration. I actually 
 managed to implement last night something that allows you to do:
 
 int[int] aa = [ 1:1 ];
 auto iter = aa.each;
 writeln(iter.front.key);
 writeln(iter.front.value);
 
 Two other iterations are possible: by key and by value (in those cases 
 iter.front just returns a key or a value).
 
 One question is, what names should these bear? I am thinking of makign 
 opSlice() a universal method of getting the "all" iterator, a default 
 that every container must implement.
 
 For AAs, there would be a "iterate keys" and "iterate values" properties 
 or functions. How should they be called?
 
 
 Thanks,
 
 Andrei

Don't know if I'm off track but ... would "opApplyKeys" and "opApplyValues" be a useful analogy in somewhat keeping with the current semantics of opApply? Justin
Oct 28 2009
prev sibling next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Walter has magically converted his work on T[new] into work on making
 associative arrays true templates defined in druntime and not considered
 very special by the compiler.
 This is very exciting because it opens up or simplifies a number of
 possibilities. One is that of implementing true iteration. I actually
 managed to implement last night something that allows you to do:
 int[int] aa = [ 1:1 ];
 auto iter = aa.each;
 writeln(iter.front.key);
 writeln(iter.front.value);
 Two other iterations are possible: by key and by value (in those cases
 iter.front just returns a key or a value).
 One question is, what names should these bear? I am thinking of makign
 opSlice() a universal method of getting the "all" iterator, a default
 that every container must implement.
 For AAs, there would be a "iterate keys" and "iterate values" properties
 or functions. How should they be called?
 Thanks,
 Andrei

Awesome, this definitely improves the interface, but how about the implementation? The current implementation, while fast for reading, is unbelievably slow for adding elements, requires a heap allocation (read: a global lock) on *every* insertion, and generates an insane amount of false pointers. Even if I succeed in making heap scanning (mostly) precise, it's not clear if the current AA impl. could easily be made to benefit, since it isn't template based. It uses RTTI internally instead, and the types it's operating on aren't known to the implementation at compile time, so I wouldn't be able to use templates to generate the bitmask at compile time. The structs it uses internally would therefor have to be scanned conservatively.
Oct 28 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Walter has magically converted his work on T[new] into work on making
 associative arrays true templates defined in druntime and not considered
 very special by the compiler.
 This is very exciting because it opens up or simplifies a number of
 possibilities. One is that of implementing true iteration. I actually
 managed to implement last night something that allows you to do:
 int[int] aa = [ 1:1 ];
 auto iter = aa.each;
 writeln(iter.front.key);
 writeln(iter.front.value);
 Two other iterations are possible: by key and by value (in those cases
 iter.front just returns a key or a value).
 One question is, what names should these bear? I am thinking of makign
 opSlice() a universal method of getting the "all" iterator, a default
 that every container must implement.
 For AAs, there would be a "iterate keys" and "iterate values" properties
 or functions. How should they be called?
 Thanks,
 Andrei

Awesome, this definitely improves the interface, but how about the implementation? The current implementation, while fast for reading, is unbelievably slow for adding elements, requires a heap allocation (read: a global lock) on *every* insertion, and generates an insane amount of false pointers. Even if I succeed in making heap scanning (mostly) precise, it's not clear if the current AA impl. could easily be made to benefit, since it isn't template based. It uses RTTI internally instead, and the types it's operating on aren't known to the implementation at compile time, so I wouldn't be able to use templates to generate the bitmask at compile time. The structs it uses internally would therefor have to be scanned conservatively.

I'm afraid that efficiency is a matter I need to defer to the community for now. Right now, I am trying to get TDPL done. Having or not having range-style iteration influences the material. Making that efficient is a matter that would not influence the material (as long as there is a strong belief that that's doable). Unrelated: one thing that we need to change about AAs is the inability to get a true reference to the stored element. aa[k] returns an rvalue, and a[k] = v is done in a manner akin to opIndexAssign. But a serious AA should have a method of reaching the actual storage for a value, I think. Andrei
Oct 28 2009
parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
Andrei Alexandrescu wrote:
 dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s 
 article
 Walter has magically converted his work on T[new] into work on making
 associative arrays true templates defined in druntime and not considered
 very special by the compiler.
 This is very exciting because it opens up or simplifies a number of
 possibilities. One is that of implementing true iteration. I actually
 managed to implement last night something that allows you to do:
 int[int] aa = [ 1:1 ];
 auto iter = aa.each;
 writeln(iter.front.key);
 writeln(iter.front.value);
 Two other iterations are possible: by key and by value (in those cases
 iter.front just returns a key or a value).
 One question is, what names should these bear? I am thinking of makign
 opSlice() a universal method of getting the "all" iterator, a default
 that every container must implement.
 For AAs, there would be a "iterate keys" and "iterate values" properties
 or functions. How should they be called?
 Thanks,
 Andrei

Awesome, this definitely improves the interface, but how about the implementation? The current implementation, while fast for reading, is unbelievably slow for adding elements, requires a heap allocation (read: a global lock) on *every* insertion, and generates an insane amount of false pointers. Even if I succeed in making heap scanning (mostly) precise, it's not clear if the current AA impl. could easily be made to benefit, since it isn't template based. It uses RTTI internally instead, and the types it's operating on aren't known to the implementation at compile time, so I wouldn't be able to use templates to generate the bitmask at compile time. The structs it uses internally would therefor have to be scanned conservatively.

I'm afraid that efficiency is a matter I need to defer to the community for now. Right now, I am trying to get TDPL done. Having or not having range-style iteration influences the material. Making that efficient is a matter that would not influence the material (as long as there is a strong belief that that's doable). Unrelated: one thing that we need to change about AAs is the inability to get a true reference to the stored element. aa[k] returns an rvalue, and a[k] = v is done in a manner akin to opIndexAssign. But a serious AA should have a method of reaching the actual storage for a value, I think.

Isn't that what the in operator does? T[U] aa; U key = something; T* p = key in aa; -Lars
Oct 28 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Lars T. Kyllingstad wrote:
 Andrei Alexandrescu wrote:
 dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s 
 article
 Walter has magically converted his work on T[new] into work on making
 associative arrays true templates defined in druntime and not 
 considered
 very special by the compiler.
 This is very exciting because it opens up or simplifies a number of
 possibilities. One is that of implementing true iteration. I actually
 managed to implement last night something that allows you to do:
 int[int] aa = [ 1:1 ];
 auto iter = aa.each;
 writeln(iter.front.key);
 writeln(iter.front.value);
 Two other iterations are possible: by key and by value (in those cases
 iter.front just returns a key or a value).
 One question is, what names should these bear? I am thinking of makign
 opSlice() a universal method of getting the "all" iterator, a default
 that every container must implement.
 For AAs, there would be a "iterate keys" and "iterate values" 
 properties
 or functions. How should they be called?
 Thanks,
 Andrei

Awesome, this definitely improves the interface, but how about the implementation? The current implementation, while fast for reading, is unbelievably slow for adding elements, requires a heap allocation (read: a global lock) on *every* insertion, and generates an insane amount of false pointers. Even if I succeed in making heap scanning (mostly) precise, it's not clear if the current AA impl. could easily be made to benefit, since it isn't template based. It uses RTTI internally instead, and the types it's operating on aren't known to the implementation at compile time, so I wouldn't be able to use templates to generate the bitmask at compile time. The structs it uses internally would therefor have to be scanned conservatively.

I'm afraid that efficiency is a matter I need to defer to the community for now. Right now, I am trying to get TDPL done. Having or not having range-style iteration influences the material. Making that efficient is a matter that would not influence the material (as long as there is a strong belief that that's doable). Unrelated: one thing that we need to change about AAs is the inability to get a true reference to the stored element. aa[k] returns an rvalue, and a[k] = v is done in a manner akin to opIndexAssign. But a serious AA should have a method of reaching the actual storage for a value, I think.

Isn't that what the in operator does? T[U] aa; U key = something; T* p = key in aa; -Lars

Correct, sorry for the oversight. Andrei
Oct 28 2009
prev sibling next sibling parent Max Samukha <spambox d-coding.com> writes:
On Wed, 28 Oct 2009 09:22:00 -0500, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

Walter has magically converted his work on T[new] into work on making 
associative arrays true templates defined in druntime and not considered 
very special by the compiler.

This is very exciting because it opens up or simplifies a number of 
possibilities. One is that of implementing true iteration. I actually 
managed to implement last night something that allows you to do:

int[int] aa = [ 1:1 ];
auto iter = aa.each;
writeln(iter.front.key);
writeln(iter.front.value);

Two other iterations are possible: by key and by value (in those cases 
iter.front just returns a key or a value).

One question is, what names should these bear? I am thinking of makign 
opSlice() a universal method of getting the "all" iterator, a default 
that every container must implement.

It looks pretty intuitive to me.
For AAs, there would be a "iterate keys" and "iterate values" properties 
or functions. How should they be called?

eachKey, eachValue keyRange, valueRange keys, values I'd prefer the last one.
Thanks,

Andrei

Oct 28 2009
prev sibling next sibling parent reply yigal chripun <yigal100 gmail.com> writes:
Andrei Alexandrescu Wrote:

 Walter has magically converted his work on T[new] into work on making 
 associative arrays true templates defined in druntime and not considered 
 very special by the compiler.
 
 This is very exciting because it opens up or simplifies a number of 
 possibilities. One is that of implementing true iteration. I actually 
 managed to implement last night something that allows you to do:
 
 int[int] aa = [ 1:1 ];
 auto iter = aa.each;
 writeln(iter.front.key);
 writeln(iter.front.value);
 
 Two other iterations are possible: by key and by value (in those cases 
 iter.front just returns a key or a value).
 
 One question is, what names should these bear? I am thinking of makign 
 opSlice() a universal method of getting the "all" iterator, a default 
 that every container must implement.
 
 For AAs, there would be a "iterate keys" and "iterate values" properties 
 or functions. How should they be called?
 
 
 Thanks,
 
 Andrei

that looks neat. What's the mechanism to tie the templates to the syntax? I don't understand why all containers must provide a default range. What's the default for a binary tree? it has several orderings (pre, post, infix) but i can't say that one is "more default" the the other.
Oct 28 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
yigal chripun wrote:
 Andrei Alexandrescu Wrote:
 
 Walter has magically converted his work on T[new] into work on making 
 associative arrays true templates defined in druntime and not considered 
 very special by the compiler.

 This is very exciting because it opens up or simplifies a number of 
 possibilities. One is that of implementing true iteration. I actually 
 managed to implement last night something that allows you to do:

 int[int] aa = [ 1:1 ];
 auto iter = aa.each;
 writeln(iter.front.key);
 writeln(iter.front.value);

 Two other iterations are possible: by key and by value (in those cases 
 iter.front just returns a key or a value).

 One question is, what names should these bear? I am thinking of makign 
 opSlice() a universal method of getting the "all" iterator, a default 
 that every container must implement.

 For AAs, there would be a "iterate keys" and "iterate values" properties 
 or functions. How should they be called?


 Thanks,

 Andrei

that looks neat. What's the mechanism to tie the templates to the syntax? I don't understand why all containers must provide a default range. What's the default for a binary tree? it has several orderings (pre, post, infix) but i can't say that one is "more default" the the other.

The cheapest to implement. As I mentioned in the Bogo containers discussion, I think any container must implement some way of iterating it. A container without an iterator is not a container. Andrei
Oct 28 2009
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
Pelle MÃ¥nsson, el 28 de octubre a las 15:48 me escribiste:
 Andrei Alexandrescu wrote:
Walter has magically converted his work on T[new] into work on
making associative arrays true templates defined in druntime and
not considered very special by the compiler.

This is very exciting because it opens up or simplifies a number
of possibilities. One is that of implementing true iteration. I
actually managed to implement last night something that allows you
to do:

int[int] aa = [ 1:1 ];
auto iter = aa.each;
writeln(iter.front.key);
writeln(iter.front.value);

Two other iterations are possible: by key and by value (in those
cases iter.front just returns a key or a value).

One question is, what names should these bear? I am thinking of
makign opSlice() a universal method of getting the "all" iterator,
a default that every container must implement.

For AAs, there would be a "iterate keys" and "iterate values"
properties or functions. How should they be called?


Thanks,

Andrei


I might be too pythonic, but aa.items sounds a little better for me ;) -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Can you stand up? I do believe it's working, good. That'll keep you going through the show Come on it's time to go.
Oct 28 2009
prev sibling next sibling parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Wed, 28 Oct 2009 17:22:00 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Walter has magically converted his work on T[new] into work on making  
 associative arrays true templates defined in druntime and not considered  
 very special by the compiler.

Wow, this is outstanding! (I hope it didn't have any negative impact on compile-time AA capabilities).
 This is very exciting because it opens up or simplifies a number of  
 possibilities. One is that of implementing true iteration. I actually  
 managed to implement last night something that allows you to do:

 int[int] aa = [ 1:1 ];
 auto iter = aa.each;
 writeln(iter.front.key);
 writeln(iter.front.value);

 Two other iterations are possible: by key and by value (in those cases  
 iter.front just returns a key or a value).

 One question is, what names should these bear? I am thinking of makign  
 opSlice() a universal method of getting the "all" iterator, a default  
 that every container must implement.

 For AAs, there would be a "iterate keys" and "iterate values" properties  
 or functions. How should they be called?


 Thanks,

 Andrei

If AA is providing a way to iterate over both keys and values (and it's a default iteration scheme), why should AA provide 2 other iteration schemes? Can't they be implemented externally (using adaptor ranges) with the same efficiency? foreach (e; keys(aa)) { writefln("key: %s", e); } foreach (e; values(aa)) { writefln("value: %s", e); } I'd also like you to add a few things in an AA interface. First, opIn should not return a pointer to Value, but a pointer to a pair of Key and Value, if possible (i.e. if this change won't sacrifice performance). Second, AA.remove method should accept result of opIn operation to avoid an additional lookup for removal: if (auto value = key in aa) { aa.remove(key); // an unnecessary lookup } Something like this would be perfect: struct Element(K,V) { const K key; V value; } struct AA(K,V) { //... ref Element opIn(K key) { /* throws an exception if element is not found */ } void remove(ref Element elem) { /* removes an element from an AA */ } void remove(K key) { remove(key in this); } AARange!(K,V) opSlice() { /* iterates over both keys and values */ } } Last, I believe foreach loop should automatically call opSlice() on iteratee. There is currently an inconsistency with built-in types - you don't have to call [] on them, yet you must call it on all the other types: // fine if array is T[] or K[V] foreach (i; array) { ... } // opSlice() is explicit and mandatory for user-defined containers because they are not ranges. foreach (i; container[]) { ... } Thanks!
Oct 28 2009
next sibling parent reply "Robert Jacques" <sandford jhu.edu> writes:
On Wed, 28 Oct 2009 15:06:34 -0400, Denis Koroskin <2korden gmail.com>  
wrote:

 On Wed, 28 Oct 2009 17:22:00 +0300, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:

 Walter has magically converted his work on T[new] into work on making  
 associative arrays true templates defined in druntime and not  
 considered very special by the compiler.

Wow, this is outstanding! (I hope it didn't have any negative impact on compile-time AA capabilities).
 This is very exciting because it opens up or simplifies a number of  
 possibilities. One is that of implementing true iteration. I actually  
 managed to implement last night something that allows you to do:

 int[int] aa = [ 1:1 ];
 auto iter = aa.each;
 writeln(iter.front.key);
 writeln(iter.front.value);

 Two other iterations are possible: by key and by value (in those cases  
 iter.front just returns a key or a value).

 One question is, what names should these bear? I am thinking of makign  
 opSlice() a universal method of getting the "all" iterator, a default  
 that every container must implement.

 For AAs, there would be a "iterate keys" and "iterate values"  
 properties or functions. How should they be called?


 Thanks,

 Andrei

If AA is providing a way to iterate over both keys and values (and it's a default iteration scheme), why should AA provide 2 other iteration schemes? Can't they be implemented externally (using adaptor ranges) with the same efficiency? foreach (e; keys(aa)) { writefln("key: %s", e); } foreach (e; values(aa)) { writefln("value: %s", e); } I'd also like you to add a few things in an AA interface. First, opIn should not return a pointer to Value, but a pointer to a pair of Key and Value, if possible (i.e. if this change won't sacrifice performance). Second, AA.remove method should accept result of opIn operation to avoid an additional lookup for removal: if (auto value = key in aa) { aa.remove(key); // an unnecessary lookup } Something like this would be perfect: struct Element(K,V) { const K key; V value; } struct AA(K,V) { //... ref Element opIn(K key) { /* throws an exception if element is not found */ }

Not finding an element is a common use case, not an exception. Using exceptions to pass information is bad style, slow and prevents the use of AAs in pure/nothrow functions. Returning a pointer to an element would allow both key and value to be accessed and could be null if no element is found.
      void remove(ref Element elem) { /* removes an element from an AA */  
 }
      void remove(K key) { remove(key in this); }

      AARange!(K,V) opSlice() { /* iterates over both keys and values */ }
 }

 Last, I believe foreach loop should automatically call opSlice() on  
 iteratee. There is currently an inconsistency with built-in types - you  
 don't have to call [] on them, yet you must call it on all the other  
 types:

 // fine if array is T[] or K[V]
 foreach (i; array) { ... }

 // opSlice() is explicit and mandatory for user-defined containers  
 because they are not ranges.
 foreach (i; container[]) { ... }

 Thanks!

Oct 28 2009
parent =?ISO-8859-15?Q?Pelle_M=E5nsson?= <pelle.mansson gmail.com> writes:
Robert Jacques wrote:
 On Wed, 28 Oct 2009 15:06:34 -0400, Denis Koroskin <2korden gmail.com> 
 wrote:
 
 On Wed, 28 Oct 2009 17:22:00 +0300, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:

 Walter has magically converted his work on T[new] into work on making 
 associative arrays true templates defined in druntime and not 
 considered very special by the compiler.

Wow, this is outstanding! (I hope it didn't have any negative impact on compile-time AA capabilities).
 This is very exciting because it opens up or simplifies a number of 
 possibilities. One is that of implementing true iteration. I actually 
 managed to implement last night something that allows you to do:

 int[int] aa = [ 1:1 ];
 auto iter = aa.each;
 writeln(iter.front.key);
 writeln(iter.front.value);

 Two other iterations are possible: by key and by value (in those 
 cases iter.front just returns a key or a value).

 One question is, what names should these bear? I am thinking of 
 makign opSlice() a universal method of getting the "all" iterator, a 
 default that every container must implement.

 For AAs, there would be a "iterate keys" and "iterate values" 
 properties or functions. How should they be called?


 Thanks,

 Andrei

If AA is providing a way to iterate over both keys and values (and it's a default iteration scheme), why should AA provide 2 other iteration schemes? Can't they be implemented externally (using adaptor ranges) with the same efficiency? foreach (e; keys(aa)) { writefln("key: %s", e); } foreach (e; values(aa)) { writefln("value: %s", e); } I'd also like you to add a few things in an AA interface. First, opIn should not return a pointer to Value, but a pointer to a pair of Key and Value, if possible (i.e. if this change won't sacrifice performance). Second, AA.remove method should accept result of opIn operation to avoid an additional lookup for removal: if (auto value = key in aa) { aa.remove(key); // an unnecessary lookup } Something like this would be perfect: struct Element(K,V) { const K key; V value; } struct AA(K,V) { //... ref Element opIn(K key) { /* throws an exception if element is not found */ }

Not finding an element is a common use case, not an exception. Using exceptions to pass information is bad style, slow and prevents the use of AAs in pure/nothrow functions. Returning a pointer to an element would allow both key and value to be accessed and could be null if no element is found.

Also, if opIn throws an exception, it kind of defeats the point of opIn, and turns it to opIndex.
Oct 28 2009
prev sibling next sibling parent =?UTF-8?B?UGVsbGUgTcOlbnNzb24=?= <pelle.mansson gmail.com> writes:
Denis Koroskin wrote:
 On Wed, 28 Oct 2009 17:22:00 +0300, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Walter has magically converted his work on T[new] into work on making 
 associative arrays true templates defined in druntime and not 
 considered very special by the compiler.

Wow, this is outstanding! (I hope it didn't have any negative impact on compile-time AA capabilities).
 This is very exciting because it opens up or simplifies a number of 
 possibilities. One is that of implementing true iteration. I actually 
 managed to implement last night something that allows you to do:

 int[int] aa = [ 1:1 ];
 auto iter = aa.each;
 writeln(iter.front.key);
 writeln(iter.front.value);

 Two other iterations are possible: by key and by value (in those cases 
 iter.front just returns a key or a value).

 One question is, what names should these bear? I am thinking of makign 
 opSlice() a universal method of getting the "all" iterator, a default 
 that every container must implement.

 For AAs, there would be a "iterate keys" and "iterate values" 
 properties or functions. How should they be called?


 Thanks,

 Andrei

If AA is providing a way to iterate over both keys and values (and it's a default iteration scheme), why should AA provide 2 other iteration schemes? Can't they be implemented externally (using adaptor ranges) with the same efficiency? foreach (e; keys(aa)) { writefln("key: %s", e); } foreach (e; values(aa)) { writefln("value: %s", e); }

Why would you prefer keys(aa) over aa.keys?
 Last, I believe foreach loop should automatically call opSlice() on 
 iteratee. There is currently an inconsistency with built-in types - you 
 don't have to call [] on them, yet you must call it on all the other types:

Try implementing the range interface (front, popFront and empty), and they are ranges. Magic! opApply is worth mentioning here, as well.
Oct 28 2009
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Wed, 28 Oct 2009 22:24:46 +0300, Robert Jacques <sandford jhu.edu>  
wrote:

 On Wed, 28 Oct 2009 15:06:34 -0400, Denis Koroskin <2korden gmail.com>  
 wrote:

 On Wed, 28 Oct 2009 17:22:00 +0300, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:

 Walter has magically converted his work on T[new] into work on making  
 associative arrays true templates defined in druntime and not  
 considered very special by the compiler.

Wow, this is outstanding! (I hope it didn't have any negative impact on compile-time AA capabilities).
 This is very exciting because it opens up or simplifies a number of  
 possibilities. One is that of implementing true iteration. I actually  
 managed to implement last night something that allows you to do:

 int[int] aa = [ 1:1 ];
 auto iter = aa.each;
 writeln(iter.front.key);
 writeln(iter.front.value);

 Two other iterations are possible: by key and by value (in those cases  
 iter.front just returns a key or a value).

 One question is, what names should these bear? I am thinking of makign  
 opSlice() a universal method of getting the "all" iterator, a default  
 that every container must implement.

 For AAs, there would be a "iterate keys" and "iterate values"  
 properties or functions. How should they be called?


 Thanks,

 Andrei

If AA is providing a way to iterate over both keys and values (and it's a default iteration scheme), why should AA provide 2 other iteration schemes? Can't they be implemented externally (using adaptor ranges) with the same efficiency? foreach (e; keys(aa)) { writefln("key: %s", e); } foreach (e; values(aa)) { writefln("value: %s", e); } I'd also like you to add a few things in an AA interface. First, opIn should not return a pointer to Value, but a pointer to a pair of Key and Value, if possible (i.e. if this change won't sacrifice performance). Second, AA.remove method should accept result of opIn operation to avoid an additional lookup for removal: if (auto value = key in aa) { aa.remove(key); // an unnecessary lookup } Something like this would be perfect: struct Element(K,V) { const K key; V value; } struct AA(K,V) { //... ref Element opIn(K key) { /* throws an exception if element is not found */ }

Not finding an element is a common use case, not an exception. Using exceptions to pass information is bad style, slow and prevents the use of AAs in pure/nothrow functions. Returning a pointer to an element would allow both key and value to be accessed and could be null if no element is found.

Ooops, right, I first wrote it to return a pointer but changed to a reference in last moment (mixed it up with opIndex for some reason). AA.remove should accept a pointer, too.
      void remove(ref Element elem) { /* removes an element from an AA  
 */ }
      void remove(K key) { remove(key in this); }

      AARange!(K,V) opSlice() { /* iterates over both keys and values */  
 }
 }

 Last, I believe foreach loop should automatically call opSlice() on  
 iteratee. There is currently an inconsistency with built-in types - you  
 don't have to call [] on them, yet you must call it on all the other  
 types:

 // fine if array is T[] or K[V]
 foreach (i; array) { ... }

 // opSlice() is explicit and mandatory for user-defined containers  
 because they are not ranges.
 foreach (i; container[]) { ... }

 Thanks!


Oct 28 2009
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Denis Koroskin wrote:
 On Wed, 28 Oct 2009 17:22:00 +0300, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Walter has magically converted his work on T[new] into work on making 
 associative arrays true templates defined in druntime and not 
 considered very special by the compiler.

Wow, this is outstanding! (I hope it didn't have any negative impact on compile-time AA capabilities).
 This is very exciting because it opens up or simplifies a number of 
 possibilities. One is that of implementing true iteration. I actually 
 managed to implement last night something that allows you to do:

 int[int] aa = [ 1:1 ];
 auto iter = aa.each;
 writeln(iter.front.key);
 writeln(iter.front.value);

 Two other iterations are possible: by key and by value (in those cases 
 iter.front just returns a key or a value).

 One question is, what names should these bear? I am thinking of makign 
 opSlice() a universal method of getting the "all" iterator, a default 
 that every container must implement.

 For AAs, there would be a "iterate keys" and "iterate values" 
 properties or functions. How should they be called?


 Thanks,

 Andrei

If AA is providing a way to iterate over both keys and values (and it's a default iteration scheme), why should AA provide 2 other iteration schemes? Can't they be implemented externally (using adaptor ranges) with the same efficiency? foreach (e; keys(aa)) { writefln("key: %s", e); } foreach (e; values(aa)) { writefln("value: %s", e); }

Of course. In fact, given the iterator with .key and .value, you can always apply map!"a.key" or map!"a.value" to select the desired member.
 I'd also like you to add a few things in an AA interface.
 
 First, opIn should not return a pointer to Value, but a pointer to a 
 pair of Key and Value, if possible (i.e. if this change won't sacrifice 
 performance).

I'm coy about adding that because it forces the implementation to hold keys and values next to each other. I think that was a minor mistake of STL - there's too much exposure of layout details.
 Second, AA.remove method should accept result of opIn operation to avoid 
 an additional lookup for removal:
 
 if (auto value = key in aa) {
     aa.remove(key); // an unnecessary lookup
 }

I'll make aa.remove(key) always work and return a bool that tells you whether there was a mapping or not.
 Something like this would be perfect:
 
 struct Element(K,V)
 {
     const K key;
     V value;
 }
 
 struct AA(K,V)
 {
     //...
     ref Element opIn(K key) { /* throws an exception if element is not 
 found */ }
     void remove(ref Element elem) { /* removes an element from an AA */ }
     void remove(K key) { remove(key in this); }
 
     AARange!(K,V) opSlice() { /* iterates over both keys and values */ }
 }
 
 Last, I believe foreach loop should automatically call opSlice() on 
 iteratee.

foreach in D2 should already call opSlice() whenever it's defined. If it doesn't, that's a bug in the compiler. Andrei
Oct 28 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Denis Koroskin wrote:
 On Wed, 28 Oct 2009 23:18:08 +0300, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 I'd also like you to add a few things in an AA interface.
  First, opIn should not return a pointer to Value, but a pointer to a 
 pair of Key and Value, if possible (i.e. if this change won't 
 sacrifice performance).

I'm coy about adding that because it forces the implementation to hold keys and values next to each other. I think that was a minor mistake of STL - there's too much exposure of layout details.

It doesn't have to be the case: key and value are both properties (i.e. methods), and they doesn't have to be located next to each other.

I see. So you want a pointer to an elaborate type featuring a key and a value.
 Second, AA.remove method should accept result of opIn operation to 
 avoid an additional lookup for removal:
  if (auto value = key in aa) {
     aa.remove(key); // an unnecessary lookup
 }

I'll make aa.remove(key) always work and return a bool that tells you whether there was a mapping or not.

Err... How does it solve the double lookup problem?

Your test looks something up and then removes it. Andrei
Oct 28 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Denis Koroskin wrote:
 On Thu, 29 Oct 2009 03:08:34 +0300, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Denis Koroskin wrote:
 On Wed, 28 Oct 2009 23:18:08 +0300, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:

 I'd also like you to add a few things in an AA interface.
  First, opIn should not return a pointer to Value, but a pointer to 
 a pair of Key and Value, if possible (i.e. if this change won't 
 sacrifice performance).

I'm coy about adding that because it forces the implementation to hold keys and values next to each other. I think that was a minor mistake of STL - there's too much exposure of layout details.

(i.e. methods), and they doesn't have to be located next to each other.

I see. So you want a pointer to an elaborate type featuring a key and a value.
 Second, AA.remove method should accept result of opIn operation to 
 avoid an additional lookup for removal:
  if (auto value = key in aa) {
     aa.remove(key); // an unnecessary lookup
 }

I'll make aa.remove(key) always work and return a bool that tells you whether there was a mapping or not.


Your test looks something up and then removes it. Andrei

Well, my extended test case looks something up, manipulates the found value, and then possibly removes it.

Ok, I understand your points, thanks for explaining. Andrei
Oct 28 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Leandro Lucarella wrote:
 Andrei Alexandrescu, el 28 de octubre a las 20:29 me escribiste:
 Your test looks something up and then removes it.


 Andrei

found value, and then possibly removes it.


What about and overload of remove() like this: bool remove(in T key, out U value); If the element was present, it's returned in "value", so you can manipulate it. I thought about just returning a pointer: U* remove(in T key); But I guess that pointer would point to the element stored in the the AA private data, but that element was just removed, so bad things would happen, that's why the only option is to copy the data, right?

I think this all is overdoing it. First, I disagree that remove should ever throw an exception. It's not a code that the user is supposed to check (with dire consequences if she doesn't), it's just additional information just in case you need it. I think bool remove(key) is better than all other designs suggested so far. Andrei
Oct 29 2009
next sibling parent reply KennyTM~ <kennytm gmail.com> writes:
On Oct 29, 09 23:59, Bill Baxter wrote:
 On Thu, Oct 29, 2009 at 8:39 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org>  wrote:
 Leandro Lucarella wrote:
 Andrei Alexandrescu, el 28 de octubre a las 20:29 me escribiste:
 Your test looks something up and then removes it.


 Andrei

Well, my extended test case looks something up, manipulates the found value, and then possibly removes it.

Ok, I understand your points, thanks for explaining.

What about and overload of remove() like this: bool remove(in T key, out U value); If the element was present, it's returned in "value", so you can manipulate it. I thought about just returning a pointer: U* remove(in T key); But I guess that pointer would point to the element stored in the the AA private data, but that element was just removed, so bad things would happen, that's why the only option is to copy the data, right?

I think this all is overdoing it. First, I disagree that remove should ever throw an exception. It's not a code that the user is supposed to check (with dire consequences if she doesn't), it's just additional information just in case you need it. I think bool remove(key) is better than all other designs suggested so far.

I agree with the folks who say it's error-prone. I can just see myself now removing a key I know is in the dictionary and being baffled when my program fails somewhere later on because I typed aa.remove("theKey") when it should have been aa.remove("thekey"). I knew it was there so I didn't want to clutter up my code with a check for it.

Um what? aa["theKey"] = 1 doesn't fail, why should aa.remove("theKey") be special? (Meanwhile, C++'s map::erase returns the number of elements removed, C#'s (CLR in general) Dictionary.Remove returns a bool, Javascript's delete x[y] returns a bool, Perl, Ruby, Objective-C, Java, etc. all won't throw exceptions (Python being the only exceptional case). Do all languages clutter up with checks?)
 So the advice would be to always check to make sure the key you remove
 got removed to be on the safe side.   If that's the case I'd rather
 just have the darn thing throw an exception so I don't have to bother
 to write things like
       if (!aa.remove("theKey")) assert(false);
 everywhere just to be on the safe side.

void discard(K,V)(ref V[K] aa, in K key) { if (!aa.remove(key)) assert(false); } ... aa.discard("theKey");
 --bb

Oct 29 2009
parent reply bearophile <bearophileHUGS lycos.com> writes:
KennyTM~:

 Um what? aa["theKey"] = 1 doesn't fail, why should aa.remove("theKey") 
 be special?

That's a different situation. You probably meant to say: If aa["theKey"]++; doesn't fail, why should aa.remove("theKey") be special?
 void discard(K,V)(ref V[K] aa, in K key) {
    if (!aa.remove(key)) assert(false);
 }

This is stupid. Part of the point of built-in AAs is to avoid to import things. If I need to import (or worse define) that discard template in all programs where I use an AA, then I will create my own AAs and I'll just import and use them in the first place. Bye, bearophile
Oct 29 2009
parent reply KennyTM~ <kennytm gmail.com> writes:
On Oct 30, 09 01:14, bearophile wrote:
 KennyTM~:

 Um what? aa["theKey"] = 1 doesn't fail, why should aa.remove("theKey")
 be special?

That's a different situation. You probably meant to say: If aa["theKey"]++; doesn't fail, why should aa.remove("theKey") be special?
 void discard(K,V)(ref V[K] aa, in K key) {
     if (!aa.remove(key)) assert(false);
 }

This is stupid. Part of the point of built-in AAs is to avoid to import things. If I need to import (or worse define) that discard template in all programs where I use an AA, then I will create my own AAs and I'll just import and use them in the first place. Bye, bearophile

This does not contradicts with .remove() not throwing an exception when a key is not found. If you like you can put it in object.d. (Moreover, having .remove() to throw means you can't delete any dictionary items in nothrow functions. Sure you can silent it with try/catch but that's expensive.)
Oct 29 2009
parent bearophile <bearophileHUGS lycos.com> writes:
KennyTM~:

 (Moreover, having .remove() to throw means you can't delete any 
 dictionary items in nothrow functions. Sure you can silent it with 
 try/catch but that's expensive.)

In nothrow functions you can use a different method, like "discard" (or a similar name less intuitive than remove), that's like remove, but it doesn't throw and just returns false when the key was absent. The idea is to use the safer method by default and the less safe one as a performance optimization (or where you are sure you want that semantics) in the other places. Bye, bearophile
Oct 29 2009
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bill Baxter wrote:
 On Thu, Oct 29, 2009 at 8:39 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Leandro Lucarella wrote:
 Andrei Alexandrescu, el 28 de octubre a las 20:29 me escribiste:
 Your test looks something up and then removes it.


 Andrei

found value, and then possibly removes it.


bool remove(in T key, out U value); If the element was present, it's returned in "value", so you can manipulate it. I thought about just returning a pointer: U* remove(in T key); But I guess that pointer would point to the element stored in the the AA private data, but that element was just removed, so bad things would happen, that's why the only option is to copy the data, right?

throw an exception. It's not a code that the user is supposed to check (with dire consequences if she doesn't), it's just additional information just in case you need it. I think bool remove(key) is better than all other designs suggested so far.

I agree with the folks who say it's error-prone. I can just see myself now removing a key I know is in the dictionary and being baffled when my program fails somewhere later on because I typed aa.remove("theKey") when it should have been aa.remove("thekey"). I knew it was there so I didn't want to clutter up my code with a check for it.

I don't find this reasonable. If you know removal must have succeeded, just type enforce(aa.remove("theKey")). I don't think that's the overwhelmingly common case though, and if it's, say, about 50/50, then it's much more sensible to have a non-throwing primitive than a throwing one. And it looks like defining two primitives just to save a call to enforce is not a good design. After all, if you argue people forget and misspell things and all, I could argue they call the wrong function out of two with very similar charters. Honest, I just read a couple of posts proposing two primitives and for the life of me I already can't remember which was throwing and which wasn't.
 So the advice would be to always check to make sure the key you remove
 got removed to be on the safe side.

I'm not seeing how that comes about. The advice is to check if you care, which is common sense. I'm wondering how such an obvious design came into question. Andrei
Oct 29 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Leandro Lucarella wrote:
 Andrei Alexandrescu, el 29 de octubre a las 12:33 me escribiste:
 Bill Baxter wrote:
 I think bool remove(key) is better than all other designs suggested so far.

myself now removing a key I know is in the dictionary and being baffled when my program fails somewhere later on because I typed aa.remove("theKey") when it should have been aa.remove("thekey"). I knew it was there so I didn't want to clutter up my code with a check for it.

succeeded, just type enforce(aa.remove("theKey")). I don't think

I don't agree, this is like saying you should bound-check every array access. I think it's nice to have safety by default.

remove from an associative array is not unsafe.
 If you are expecting
 to have an error, you will be extra careful to spell the key correctly.
 This should cover the cases where you are distracted and not expecting any
 errors.
 
 I think the check could be done in non-release mode only though, just
 like bound checking.

Bounds checking is eliminated in release mode solely for efficiency reasons. There is no other good reason to eliminate checks. But in the case of remove, the amount of work done is consistent enough to make a check negligible.
 that's the overwhelmingly common case though, and if it's, say,
 about 50/50, then it's much more sensible to have a non-throwing
 primitive than a throwing one. And it looks like defining two
 primitives just to save a call to enforce is not a good design.

This is one case where I think practicality beats purity, because the whole point of throwing an exception is for the cases you don't expect it to fail. Again, think of array bounds, you can force the programmer to add lots of enforce() in the code and remove bound checking from the compiler.

I can't think of array bounds. The situations are completely unrelated. Andrei
Oct 29 2009
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 I'll make aa.remove(key) always work and return a bool that tells you 
 whether there was a mapping or not.

I think that's a small design mistake. In a high level language you want things to not fail silently. You want them to fail in an explicit way because programmers often forget to read and use return values. So AAs may have two methods, "remove" and "drop" (Python sets use "remove" and "discard" for this). The "remove" can be the safer one and used by default in D programs (especially in SafeD modules, safety is in things like this too), that raises an exception when you try to remove a missing key. "drop/discard" is faster and silent, it removes the key if it's present, as you want. Bye, bearophile
Oct 29 2009
parent reply =?ISO-8859-1?Q?Pelle_M=E5nsson?= <pelle.mansson gmail.com> writes:
bearophile wrote:
 Andrei Alexandrescu:
 
 I'll make aa.remove(key) always work and return a bool that tells you 
 whether there was a mapping or not.

I think that's a small design mistake. In a high level language you want things to not fail silently. You want them to fail in an explicit way because programmers often forget to read and use return values. So AAs may have two methods, "remove" and "drop" (Python sets use "remove" and "discard" for this). The "remove" can be the safer one and used by default in D programs (especially in SafeD modules, safety is in things like this too), that raises an exception when you try to remove a missing key. "drop/discard" is faster and silent, it removes the key if it's present, as you want. Bye, bearophile

I agree with this. I usually want exceptions.
Oct 29 2009
parent reply bearophile <bearophileHUGS lycos.com> writes:
Pelle Månsson:
 I agree with this. I usually want exceptions.

A problem with this is that currently exceptions are very slow in D compiled with DMD (something like up to 11 times slower than Java exceptions, about 2-4 times slower than Python exceptions. I have a benchmark for this on my site). Bye, bearophile
Oct 29 2009
parent bearophile <bearophileHUGS lycos.com> writes:
bearophile:

 Pelle Månsson:
 I agree with this. I usually want exceptions.

A problem with this is that currently exceptions are very slow in D compiled with DMD (something like up to 11 times slower than Java exceptions, about 2-4 times slower than Python exceptions. I have a benchmark for this on my site).

I have just redone the timings with LDC on Ubuntu (I'll put the timings online soon): Java is only about 2.2 times faster. This is much better and almost acceptable. Bye, bearophile
Oct 29 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Thu, Oct 29, 2009 at 9:57 AM, KennyTM~ <kennytm gmail.com> wrote:
 On Oct 29, 09 23:59, Bill Baxter wrote:
 On Thu, Oct 29, 2009 at 8:39 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> =A0wrote:
 Leandro Lucarella wrote:
 Andrei Alexandrescu, el 28 de octubre a las 20:29 me escribiste:
 Your test looks something up and then removes it.


 Andrei

Well, my extended test case looks something up, manipulates the found value, and then possibly removes it.

Ok, I understand your points, thanks for explaining.

What about and overload of remove() like this: bool remove(in T key, out U value); If the element was present, it's returned in "value", so you can manipulate it. I thought about just returning a pointer: U* remove(in T key); But I guess that pointer would point to the element stored in the the =




 private data, but that element was just removed, so bad things would
 happen, that's why the only option is to copy the data, right?

I think this all is overdoing it. First, I disagree that remove should ever throw an exception. It's not a code that the user is supposed to check (with dire consequences if she doesn't), it's just additional information jus=



 in
 case you need it.

 I think bool remove(key) is better than all other designs suggested so
 far.

I agree with the folks who say it's error-prone. =A0I can just see myself now removing a key I know is in the dictionary and being baffled when my program fails somewhere later on because I typed aa.remove("theKey") when it should have been aa.remove("thekey"). =A0I knew it was there so I didn't want to clutter up my code with a check for it.

Um what? aa["theKey"] =3D 1 doesn't fail, why should aa.remove("theKey") =

 special?

Huh? aa["theKey"] also doesn't delete entries, so why should it have any bearing on what remove does? The closest analogy would be the case where aa["theKey"]=3D1 can't do what you ask it to, which would be an out-of-memory error, which does generate an error of some kind, I think.
 (Meanwhile, C++'s map::erase returns the number of elements removed,

Erase is also used for multimaps, so it's not exactly apples to apples. Your other examples are ok AFAIK.
 void discard(K,V)(ref V[K] aa, in K key) {
 =A0if (!aa.remove(key)) assert(false);
 }

 ...

 aa.discard("theKey");

Yep, that would be perfect if it were part of the standard AA interface. That would actually be my preferred solution, to have two methods for removing elements from AAs. One that throws and one that doesn't. --bb
Oct 29 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Thu, Oct 29, 2009 at 10:33 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Bill Baxter wrote:
 On Thu, Oct 29, 2009 at 8:39 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Leandro Lucarella wrote:
 Andrei Alexandrescu, el 28 de octubre a las 20:29 me escribiste:
 Your test looks something up and then removes it.


 Andrei

Well, my extended test case looks something up, manipulates the found value, and then possibly removes it.

Ok, I understand your points, thanks for explaining.

What about and overload of remove() like this: bool remove(in T key, out U value); If the element was present, it's returned in "value", so you can manipulate it. I thought about just returning a pointer: U* remove(in T key); But I guess that pointer would point to the element stored in the the =




 private data, but that element was just removed, so bad things would
 happen, that's why the only option is to copy the data, right?

I think this all is overdoing it. First, I disagree that remove should ever throw an exception. It's not a code that the user is supposed to check (with dire consequences if she doesn't), it's just additional information jus=



 in
 case you need it.

 I think bool remove(key) is better than all other designs suggested so
 far.

I agree with the folks who say it's error-prone. =A0I can just see myself now removing a key I know is in the dictionary and being baffled when my program fails somewhere later on because I typed aa.remove("theKey") when it should have been aa.remove("thekey"). =A0I knew it was there so I didn't want to clutter up my code with a check for it.

I don't find this reasonable. If you know removal must have succeeded, ju=

 type enforce(aa.remove("theKey")). I don't think that's the overwhelmingl=

 common case though, and if it's, say, about 50/50, then it's much more
 sensible to have a non-throwing primitive than a throwing one. And it loo=

 like defining two primitives just to save a call to enforce is not a good
 design. After all, if you argue people forget and misspell things and all=

 could argue they call the wrong function out of two with very similar
 charters. Honest, I just read a couple of posts proposing two primitives =

 for the life of me I already can't remember which was throwing and which
 wasn't.

 So the advice would be to always check to make sure the key you remove
 got removed to be on the safe side.

I'm not seeing how that comes about. The advice is to check if you care, which is common sense. I'm wondering how such an obvious design came into question.

Ok. You convinced me that your way doesn't suck. And you can ask the Python designers how the so-called "obvious" design came into question. Apparently it's not as obvious as you think. And I would suggest from experience with Python that having two functions is also not as bad as you think. But I'm convinced that having two functions is not significantly better than having one function that doesn't throw. --bb
Oct 29 2009
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
rmcguire wrote:
 Wouldn't opIn be more useful if it returned a range starting with
 the element that was found?

Thought about that, but it's hard to justify. Items aren't sorted in any particular order, so you'd get pretty much a random bunch of stuff. Andrei
Oct 29 2009
prev sibling next sibling parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Wed, 28 Oct 2009 23:18:08 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 I'd also like you to add a few things in an AA interface.
  First, opIn should not return a pointer to Value, but a pointer to a  
 pair of Key and Value, if possible (i.e. if this change won't sacrifice  
 performance).

I'm coy about adding that because it forces the implementation to hold keys and values next to each other. I think that was a minor mistake of STL - there's too much exposure of layout details.

It doesn't have to be the case: key and value are both properties (i.e. methods), and they doesn't have to be located next to each other.
 Second, AA.remove method should accept result of opIn operation to  
 avoid an additional lookup for removal:
  if (auto value = key in aa) {
     aa.remove(key); // an unnecessary lookup
 }

I'll make aa.remove(key) always work and return a bool that tells you whether there was a mapping or not.

Err... How does it solve the double lookup problem?
Oct 28 2009
next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Thu, Oct 29, 2009 at 11:16 AM, Leandro Lucarella <llucax gmail.com> wrot=
e:
 Andrei Alexandrescu, el 29 de octubre a las 12:33 me escribiste:
 Bill Baxter wrote:
I think bool remove(key) is better than all other designs suggested so=




I agree with the folks who say it's error-prone. =A0I can just see
myself now removing a key I know is in the dictionary and being
baffled when my program fails somewhere later on because I typed
aa.remove("theKey") when it should have been aa.remove("thekey"). =A0I
knew it was there so I didn't want to clutter up my code with a check
for it.

I don't find this reasonable. If you know removal must have succeeded, just type enforce(aa.remove("theKey")). I don't think

I don't agree, this is like saying you should bound-check every array access. I think it's nice to have safety by default. If you are expecting to have an error, you will be extra careful to spell the key correctly. This should cover the cases where you are distracted and not expecting an=

 errors.

 I think the check could be done in non-release mode only though, just
 like bound checking.

 that's the overwhelmingly common case though, and if it's, say,
 about 50/50, then it's much more sensible to have a non-throwing
 primitive than a throwing one. And it looks like defining two
 primitives just to save a call to enforce is not a good design.

This is one case where I think practicality beats purity, because the whole point of throwing an exception is for the cases you don't expect it to fail. Again, think of array bounds, you can force the programmer to ad=

 lots of enforce() in the code and remove bound checking from the compiler=

The problem with this argument (which I was also about to make) is that accessing an array out of bounds is almost never normal behavior. But for deleting things, it often is. Think also about delete itself. In C++ and D delete on a null pointer is fine, and is defined to be a no-op. This was done explicitly to avoid always having to do if(p) delete p; all the time. And really it works out just fine in practice. If we're going to keep arguing that remove() should throw, then really delete should too. --bb
Oct 29 2009
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
KennyTM~, el 30 de octubre a las 02:55 me escribiste:
 On Oct 30, 09 01:14, bearophile wrote:
KennyTM~:

Um what? aa["theKey"] = 1 doesn't fail, why should aa.remove("theKey")
be special?

That's a different situation. You probably meant to say: If aa["theKey"]++; doesn't fail, why should aa.remove("theKey") be special?
void discard(K,V)(ref V[K] aa, in K key) {
    if (!aa.remove(key)) assert(false);
}

This is stupid. Part of the point of built-in AAs is to avoid to import things. If I need to import (or worse define) that discard template in all programs where I use an AA, then I will create my own AAs and I'll just import and use them in the first place. Bye, bearophile

This does not contradicts with .remove() not throwing an exception when a key is not found. If you like you can put it in object.d. (Moreover, having .remove() to throw means you can't delete any dictionary items in nothrow functions. Sure you can silent it with try/catch but that's expensive.)

Not if it's an Error instead of an exception. I think that should be the case, .remove() check should be like bound check, something done at non-release mode, not something to use in the regular flow of a program to avoid using opIn(). -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Esta desenchufada la internet de ese televisor? -- Osvaldo Lucarella
Oct 29 2009
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
Bill Baxter, el 29 de octubre a las 11:31 me escribiste:
 On Thu, Oct 29, 2009 at 11:16 AM, Leandro Lucarella <llucax gmail.com> wrote:
 that's the overwhelmingly common case though, and if it's, say,
 about 50/50, then it's much more sensible to have a non-throwing
 primitive than a throwing one. And it looks like defining two
 primitives just to save a call to enforce is not a good design.

This is one case where I think practicality beats purity, because the whole point of throwing an exception is for the cases you don't expect it to fail. Again, think of array bounds, you can force the programmer to add lots of enforce() in the code and remove bound checking from the compiler.

The problem with this argument (which I was also about to make) is that accessing an array out of bounds is almost never normal behavior. But for deleting things, it often is. Think also about delete itself. In C++ and D delete on a null pointer is fine, and is defined to be a no-op. This was done explicitly to avoid always having to do if(p) delete p; all the time. And really it works out just fine in practice. If we're going to keep arguing that remove() should throw, then really delete should too.

I don't think that's the case. If you have a wrong value in a pointer you want to delete, you have 2 posibilities: 1) Is a valid pointer, in which case you are doomed, a wrong object would be deleted and your program is fucked. 2) Is an invalid pointer, in which case the program could explode inmediately. I don't know what happens right now, but an assertion in this case sould be useful too. NULL pointers are not a problem, because it rarely indicates an error in the program. delete NULL is useful for things like: T* x = null; if (blah) x = bleh(); foo(x); delete x; This might ever be a valid use case for AA too, and in that case I don't really mind aa.remove(null) being a NOP. The issue is with WRONG KEYS, not null ones. If you messed up with your key (as well if you messed up with your pointer in delete), *bad* things will happen, and even when it's not "memory corruption" per se (I think the definition of SafeD is avoiding memory corruption only), it's close to undefined behaviour, your program will start failing in very strange ways and the bug will be too hard to track. It's basically the same problem with dynamic languages that doesn't even need to "declare" a variable, like perl or php. So I think yes, remove should assert() if the key is not found (unless you explicitly inform in some way that you don't want it to fail) and delete should assert() if the pointer passed is not a pointer to a start of a HEAP block. In either case is a bug, and I want to be notified about it as soon as possible (at least in non-release mode). -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Sé que tu me miras, pero yo me juraría que, en esos ojos negros que tenés, hay un indio sensible que piensa: "Qué bárbaro que este tipo blanco esté tratando de comunicarse conmigo que soy un ser inferior en la escala del homo sapiens". Por eso, querido indio, no puedo dejar de mirarte como si fueras un cobayo de mierda al que puedo pisar cuando quiera. -- Ricardo Vaporeso. Carta a los aborígenes, ed. Gredos, Barcelona, 1912, página 102.
Oct 29 2009
prev sibling parent Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, el 29 de octubre a las 13:26 me escribiste:
 Leandro Lucarella wrote:
Andrei Alexandrescu, el 29 de octubre a las 12:33 me escribiste:
Bill Baxter wrote:
I think bool remove(key) is better than all other designs suggested so far.

myself now removing a key I know is in the dictionary and being baffled when my program fails somewhere later on because I typed aa.remove("theKey") when it should have been aa.remove("thekey"). I knew it was there so I didn't want to clutter up my code with a check for it.

succeeded, just type enforce(aa.remove("theKey")). I don't think

I don't agree, this is like saying you should bound-check every array access. I think it's nice to have safety by default.

remove from an associative array is not unsafe.

In the sense of "memory corruption" that's true. But your program will be at least leaking memory if you thought the key was present in the AA (the majority of the use cases for remove(), because come on, if you are removing something is because you think there are very good chances that the key is in the AA :). If your program then iterate the AA and there is a key that it doesn't supposed to be there, it will blow in very original, hard to track ways. What's the point of that? If you want to remove something you're not sure that it really is in the AA, you should use a safer method. Maybe try_remove(), or an extra parameter to remove() or maybe just ask first if key in aa (but this have an extra lookup, and I think that's the whole point that triggered this discussion).
If you are expecting
to have an error, you will be extra careful to spell the key correctly.
This should cover the cases where you are distracted and not expecting any
errors.

I think the check could be done in non-release mode only though, just
like bound checking.

Bounds checking is eliminated in release mode solely for efficiency reasons. There is no other good reason to eliminate checks. But in

Well, same for removing an inexistent key from an AA then :)
 the case of remove, the amount of work done is consistent enough to
 make a check negligible.

Ok, but make it an "Error", not an exception, so it can be used in nothrow functions. Removing an nonexistent key from an AA should be a programming error, except if you state in some way that you know that maybe the key is not there (which should be rare).
that's the overwhelmingly common case though, and if it's, say,
about 50/50, then it's much more sensible to have a non-throwing
primitive than a throwing one. And it looks like defining two
primitives just to save a call to enforce is not a good design.

This is one case where I think practicality beats purity, because the whole point of throwing an exception is for the cases you don't expect it to fail. Again, think of array bounds, you can force the programmer to add lots of enforce() in the code and remove bound checking from the compiler.

I can't think of array bounds. The situations are completely unrelated.

I don't think they are *completely* unrelated. I agree is not exactly the same because array bound error automatically imply memory corruption and nonexistent key removal don't, but they are a bug in the vast majority of the cases. At least this is my experience. -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Fitter, happier, more productive, comfortable, not drinking too much, regular exercise at the gym (3 days a week), getting on better with your associate employee contemporaries,
Oct 29 2009
prev sibling next sibling parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Thu, 29 Oct 2009 03:08:34 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Denis Koroskin wrote:
 On Wed, 28 Oct 2009 23:18:08 +0300, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:

 I'd also like you to add a few things in an AA interface.
  First, opIn should not return a pointer to Value, but a pointer to a  
 pair of Key and Value, if possible (i.e. if this change won't  
 sacrifice performance).

I'm coy about adding that because it forces the implementation to hold keys and values next to each other. I think that was a minor mistake of STL - there's too much exposure of layout details.

(i.e. methods), and they doesn't have to be located next to each other.

I see. So you want a pointer to an elaborate type featuring a key and a value.
 Second, AA.remove method should accept result of opIn operation to  
 avoid an additional lookup for removal:
  if (auto value = key in aa) {
     aa.remove(key); // an unnecessary lookup
 }

I'll make aa.remove(key) always work and return a bool that tells you whether there was a mapping or not.


Your test looks something up and then removes it. Andrei

Well, my extended test case looks something up, manipulates the found value, and then possibly removes it.
Oct 28 2009
parent Leandro Lucarella <llucax gmail.com> writes:
Simen Kjaeraas, el 29 de octubre a las 22:06 me escribiste:
 bearophile <bearophileHUGS lycos.com> wrote:
 
In nothrow functions you can use a different method, like
"discard" (or a similar name less intuitive than remove), that's
like remove, but it doesn't throw and just returns false when the
key was absent.
The idea is to use the safer method by default and the less safe
one as a performance optimization (or where you are sure you want
that semantics) in the other places.

Please elucidate, what is unsafe about deleting something that isn't there?

A leak, if the key was wrong (not if the key was right but deleted before). -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Soñé que tenia un caballo Que me trataba mejor que vos Tenia tan buena onda con ella Era mi yegua, mucho mas que vos
Oct 29 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 28 Oct 2009 20:08:34 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Denis Koroskin wrote:
 On Wed, 28 Oct 2009 23:18:08 +0300, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:

 I'd also like you to add a few things in an AA interface.
  First, opIn should not return a pointer to Value, but a pointer to a  
 pair of Key and Value, if possible (i.e. if this change won't  
 sacrifice performance).

I'm coy about adding that because it forces the implementation to hold keys and values next to each other. I think that was a minor mistake of STL - there's too much exposure of layout details.

(i.e. methods), and they doesn't have to be located next to each other.

I see. So you want a pointer to an elaborate type featuring a key and a value.

It could return a struct holding both pointers, instead of a pointer to a struct. I don't see this as unreasonable (in fact, it's what I do in dcollections, see the cursor types). An implementation which stores the two close together in actuality may only require one pointer. The only drawback of this is it can't be part of a formal interface, since the return type is defined by the derived type. -Steve
Oct 29 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Thu, Oct 29, 2009 at 8:39 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Leandro Lucarella wrote:
 Andrei Alexandrescu, el 28 de octubre a las 20:29 me escribiste:
 Your test looks something up and then removes it.


 Andrei

Well, my extended test case looks something up, manipulates the found value, and then possibly removes it.

Ok, I understand your points, thanks for explaining.

What about and overload of remove() like this: bool remove(in T key, out U value); If the element was present, it's returned in "value", so you can manipulate it. I thought about just returning a pointer: U* remove(in T key); But I guess that pointer would point to the element stored in the the AA private data, but that element was just removed, so bad things would happen, that's why the only option is to copy the data, right?

I think this all is overdoing it. First, I disagree that remove should ever throw an exception. It's not a code that the user is supposed to check (with dire consequences if she doesn't), it's just additional information just in case you need it. I think bool remove(key) is better than all other designs suggested so far.

I agree with the folks who say it's error-prone. I can just see myself now removing a key I know is in the dictionary and being baffled when my program fails somewhere later on because I typed aa.remove("theKey") when it should have been aa.remove("thekey"). I knew it was there so I didn't want to clutter up my code with a check for it. So the advice would be to always check to make sure the key you remove got removed to be on the safe side. If that's the case I'd rather just have the darn thing throw an exception so I don't have to bother to write things like if (!aa.remove("theKey")) assert(false); everywhere just to be on the safe side. --bb
Oct 29 2009
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, el 29 de octubre a las 12:33 me escribiste:
 Bill Baxter wrote:
I think bool remove(key) is better than all other designs suggested so far.

I agree with the folks who say it's error-prone. I can just see myself now removing a key I know is in the dictionary and being baffled when my program fails somewhere later on because I typed aa.remove("theKey") when it should have been aa.remove("thekey"). I knew it was there so I didn't want to clutter up my code with a check for it.

I don't find this reasonable. If you know removal must have succeeded, just type enforce(aa.remove("theKey")). I don't think

I don't agree, this is like saying you should bound-check every array access. I think it's nice to have safety by default. If you are expecting to have an error, you will be extra careful to spell the key correctly. This should cover the cases where you are distracted and not expecting any errors. I think the check could be done in non-release mode only though, just like bound checking.
 that's the overwhelmingly common case though, and if it's, say,
 about 50/50, then it's much more sensible to have a non-throwing
 primitive than a throwing one. And it looks like defining two
 primitives just to save a call to enforce is not a good design.

This is one case where I think practicality beats purity, because the whole point of throwing an exception is for the cases you don't expect it to fail. Again, think of array bounds, you can force the programmer to add lots of enforce() in the code and remove bound checking from the compiler. -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- EXPOSICION INTERNACIONAL DE INODOROS -- Crónica TV
Oct 29 2009
prev sibling next sibling parent rmcguire <rjmcguire gmail.com> writes:
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 
 Denis Koroskin wrote:
 On Wed, 28 Oct 2009 17:22:00 +0300, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Walter has magically converted his work on T[new] into work on making 
 associative arrays true templates defined in druntime and not 
 considered very special by the compiler.

Wow, this is outstanding! (I hope it didn't have any negative impact on compile-time AA capabilities).
 This is very exciting because it opens up or simplifies a number of 
 possibilities. One is that of implementing true iteration. I actually 
 managed to implement last night something that allows you to do:

 int[int] aa = [ 1:1 ];
 auto iter = aa.each;
 writeln(iter.front.key);
 writeln(iter.front.value);

 Two other iterations are possible: by key and by value (in those cases 
 iter.front just returns a key or a value).

 One question is, what names should these bear? I am thinking of makign 
 opSlice() a universal method of getting the "all" iterator, a default 
 that every container must implement.

 For AAs, there would be a "iterate keys" and "iterate values" 
 properties or functions. How should they be called?


 Thanks,

 Andrei

If AA is providing a way to iterate over both keys and values (and it's a default iteration scheme), why should AA provide 2 other iteration schemes? Can't they be implemented externally (using adaptor ranges) with the same efficiency? foreach (e; keys(aa)) { writefln("key: %s", e); } foreach (e; values(aa)) { writefln("value: %s", e); }

Of course. In fact, given the iterator with .key and .value, you can always apply map!"a.key" or map!"a.value" to select the desired member.
 I'd also like you to add a few things in an AA interface.
 
 First, opIn should not return a pointer to Value, but a pointer to a 
 pair of Key and Value, if possible (i.e. if this change won't sacrifice 
 performance).

I'm coy about adding that because it forces the implementation to hold keys and values next to each other. I think that was a minor mistake of STL - there's too much exposure of layout details.
 Second, AA.remove method should accept result of opIn operation to avoid 
 an additional lookup for removal:
 
 if (auto value = key in aa) {
     aa.remove(key); // an unnecessary lookup
 }

I'll make aa.remove(key) always work and return a bool that tells you whether there was a mapping or not.
 Something like this would be perfect:
 
 struct Element(K,V)
 {
     const K key;
     V value;
 }
 
 struct AA(K,V)
 {
     //...
     ref Element opIn(K key) { /* throws an exception if element is not 
 found */ }
     void remove(ref Element elem) { /* removes an element from an AA */ }
     void remove(K key) { remove(key in this); }
 
     AARange!(K,V) opSlice() { /* iterates over both keys and values */ }
 }
 
 Last, I believe foreach loop should automatically call opSlice() on 
 iteratee.

foreach in D2 should already call opSlice() whenever it's defined. If it doesn't, that's a bug in the compiler. Andrei

Wouldn't opIn be more useful if it returned a range starting with the element that was found?
Oct 29 2009
prev sibling next sibling parent rmcguire <rjmcguire gmail.com> writes:
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 
 rmcguire wrote:
 Wouldn't opIn be more useful if it returned a range starting with
 the element that was found?

Thought about that, but it's hard to justify. Items aren't sorted in any particular order, so you'd get pretty much a random bunch of stuff. Andrei

Oh, I thought that ranges were going to be applied to network/file IO. I which case there is plenty of places one might want to jump forward in a "range"... ah but then we're talking AAs. Well I suppose the Socket class could have opIn return a range. hehe Thanks for replying
Oct 29 2009
prev sibling parent "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
bearophile <bearophileHUGS lycos.com> wrote:

 In nothrow functions you can use a different method, like "discard" (or  
 a similar name less intuitive than remove), that's like remove, but it  
 doesn't throw and just returns false when the key was absent.
 The idea is to use the safer method by default and the less safe one as  
 a performance optimization (or where you are sure you want that  
 semantics) in the other places.

Please elucidate, what is unsafe about deleting something that isn't there? -- Simen
Oct 29 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 28 Oct 2009 10:22:00 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Walter has magically converted his work on T[new] into work on making  
 associative arrays true templates defined in druntime and not considered  
 very special by the compiler.

 This is very exciting because it opens up or simplifies a number of  
 possibilities. One is that of implementing true iteration. I actually  
 managed to implement last night something that allows you to do:

 int[int] aa = [ 1:1 ];
 auto iter = aa.each;
 writeln(iter.front.key);
 writeln(iter.front.value);

 Two other iterations are possible: by key and by value (in those cases  
 iter.front just returns a key or a value).

 One question is, what names should these bear? I am thinking of makign  
 opSlice() a universal method of getting the "all" iterator, a default  
 that every container must implement.

 For AAs, there would be a "iterate keys" and "iterate values" properties  
 or functions. How should they be called?

This is one of the benefits of opApply. Rather than define some new names to make up for the deficiencies of range-foreach (especially so near the release of D2), why not focus on fixing the deficiencies? i.e. here is my desired interface: foreach(key, value; aa) // iterate both keys and values foreach(value; aa) // default to iterating values (to be consistent with arrays) foreach(key; aa.keys) // iterate keys lazily. If ranges cannot support this interface, I think the range paradigm needs some work. opApply works great for this since the delegate type defines the desired interface. The problem with ranges is the potential "opRange" method in it's natural form should overload on the return value, and have no parameters. Here's a possible idea: make foreach(X x, Y y, Z z,...; container) translate to foreach(X x, Y y, Z z,...; container.opRange!(X, Y, Z, ...)()). In the case of inferred types, maybe you can do something like: opRange!(T : keytype, U : valuetype)() and have the compiler infer types for x and y as keytype and valuetype. The only drawback I see is opRange can't be virtual. However, how does one define a "virtual" range return, when a range is invariably a struct? It might be a moot issue. Of course, you could always meet the virtual requirement by naming your range-generating method... So with my scheme, the AA becomes: struct AA(Key, Val) { ... // implementation auto opRange(T : Key, U : Val)() { ... } auto opRange(T : Val)() { ... } auto keys() { ... } } One other thing left to define is what foreach calls to get the key(s) in the case of multiple arguments to foreach. The value is generally accessed via front(). Might as well tackle the ref possibilities too :) -Steve
Oct 29 2009
prev sibling parent Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, el 28 de octubre a las 20:29 me escribiste:
Your test looks something up and then removes it.


Andrei

Well, my extended test case looks something up, manipulates the found value, and then possibly removes it.

Ok, I understand your points, thanks for explaining.

What about and overload of remove() like this: bool remove(in T key, out U value); If the element was present, it's returned in "value", so you can manipulate it. I thought about just returning a pointer: U* remove(in T key); But I guess that pointer would point to the element stored in the the AA private data, but that element was just removed, so bad things would happen, that's why the only option is to copy the data, right? -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Y no es el centro del Universo! El sol gira alrededor de la tierra! Miles de girasoles no pueden estar equivocados! -- Inodoro Pereyra
Oct 29 2009