digitalmars.D - in operator generalization


Les Baker <les_baker REMOVEbellsouthREMOVE.net> writes:

Looking through old NG posts I found this one by Ben Hinkle last year 
about extending the "in" operator to support static/dynamic arrays.

http://www.digitalmars.com/d/archives/digitalmars/D/25164

Was this ever totally dismissed?  It just seemed to fall off the radar.
I'm liking the concept though; I don't know how many times I've
written loops in quickie utility programs just to hunt for an item and
return it.  I would also suggest (if it hasn't already
been) that "in" be overloadable, so that if D supports other data
structures in the standard library they can use that syntax as well.

The only disadvantage I can think of is that when a developer sees "in",
he/she can't assume it's a constant-time operation anymore.  I think the
increase in utility outweighs that though.

Thoughts?

Les Baker

Mar 17 2006

BCS <BCS_member pathlink.com> writes:

Les Baker wrote:
> Looking through old NG posts I found this one by Ben Hinkle last year
> about extending the "in" operator to support static/dynamic arrays.
>
> http://www.digitalmars.com/d/archives/digitalmars/D/25164
>
> Was this ever totally dismissed?  It just seemed to fall off the radar.
> I'm liking the concept though; I don't know how many times I've written
> loops in quickie utility programs just to hunt for an item and return
> it.  I would also suggest (if it hasn't already been) that "in" be
> overloadable, so that if D supports other data structures in the
> standard library they can use that syntax as well.
>
> The only disadvantage I can think of is that when a developer sees "in",
> he/she can't assume it's a constant-time operation anymore.  I think the
> increase in utility outweighs that though.
>
> Thoughts?
>
> Les Baker

Actually this would be backwards: "in" determines whether the given value is a
key for the AA. Using it to examine the contents would be something different
from its current meaning, but this might not be a bad idea.

Mar 17 2006

Ivan Senji <ivan.senji_REMOVE_ _THIS__gmail.com> writes:

BCS wrote:
> Les Baker wrote:
>> Looking through old NG posts I found this one by Ben Hinkle last year
>> about extending the "in" operator to support static/dynamic arrays.
>>
>> http://www.digitalmars.com/d/archives/digitalmars/D/25164
>>
>> Was this ever totally dismissed?  It just seemed to fall off the radar.
>> I'm liking the concept though; I don't know how many times I've written
>> loops in quickie utility programs just to hunt for an item and return
>> it.  I would also suggest (if it hasn't already been) that "in" be
>> overloadable, so that if D supports other data structures in the
>> standard library they can use that syntax as well.
>>
>> The only disadvantage I can think of is that when a developer sees "in",
>> he/she can't assume it's a constant-time operation anymore.  I think the
>> increase in utility outweighs that though.
>>
>> Thoughts?
>>
>> Les Baker
>
> Actually this would be backwards: "in" determines whether the given value is
> a key for the AA. Using it to examine the contents would be something
> different from its current meaning, but this might not be a bad idea.

I didn't understand before, nor do I understand now, why this was always an
argument against 'in' for arrays.

float[MyObject] array;  -> stores MyObjects
MyObject[]      array;  -> stores MyObjects

AAs are different from normal arrays. In an AA the index is the thing you
are storing, and it makes sense to search for it.

Mar 17 2006

BCS <BCS_member pathlink.com> writes:

Ivan Senji wrote:
> BCS wrote:
>> Les Baker wrote:
>>> Looking through old NG posts I found this one by Ben Hinkle last year
>>> about extending the "in" operator to support static/dynamic arrays.
>>>
>>> http://www.digitalmars.com/d/archives/digitalmars/D/25164
>>>
>>> Was this ever totally dismissed?  It just seemed to fall off the radar.
>>> I'm liking the concept though; I don't know how many times I've written
>>> loops in quickie utility programs just to hunt for an item and return
>>> it.  I would also suggest (if it hasn't already been) that "in" be
>>> overloadable, so that if D supports other data structures in the
>>> standard library they can use that syntax as well.
>>>
>>> The only disadvantage I can think of is that when a developer sees "in",
>>> he/she can't assume it's a constant-time operation anymore.  I think the
>>> increase in utility outweighs that though.
>>>
>>> Thoughts?
>>>
>>> Les Baker
>>
>> Actually this would be backwards: "in" determines whether the given value is
>> a key for the AA. Using it to examine the contents would be something
>> different from its current meaning, but this might not be a bad idea.
>
> I didn't understand before, nor do I understand now, why this was always an
> argument against 'in' for arrays.
>
> float[MyObject] array;  -> stores MyObjects
> MyObject[]      array;  -> stores MyObjects
>
> AAs are different from normal arrays. In an AA the index is the thing you
> are storing, and it makes sense to search for it.

First of all, I don't have an opinion with regard to the "in" operator
searching through an array.

Second, IIRC the intended use of an AA is for the keys to be used to reference
the content, not for storing the keys. However, AAs do store the keys and would
make a good device for storing a set of things.

I think the argument you refer to comes from a desire for orthogonality. With
an AA the "in" operator searches the things that go in the []; using "in" on a
normal array would search the things that come out when you index into the
array.
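The distinction drawn above can be sketched in code. This is only an illustration, assuming present-day D2 and Phobos (`std.algorithm.searching.canFind`), neither of which existed in this form when the thread was written; the helper names `hasKey` and `hasValue` are made up for the example.

```d
import std.algorithm.searching : canFind;

/// True when `key` is one of the keys of the associative array,
/// i.e. one of the things that "go in the []".
bool hasKey(float[string] table, string key)
{
    // `key in table` yields a pointer to the stored value, or null.
    return (key in table) !is null;
}

/// True when `value` is one of the elements of the plain array,
/// i.e. one of the things that "come out" when you index into it.
bool hasValue(const(string)[] items, string value)
{
    // Linear search over the stored elements (library code, not `in`).
    return items.canFind(value);
}
```

With an AA the built-in operator answers the first question only; the second is answered here by a library call rather than by `in`.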

Mar 17 2006

"Regan Heath" <regan netwin.co.nz> writes:

On Fri, 17 Mar 2006 17:50:02 -0500, Les Baker  
<les_baker REMOVEbellsouthREMOVE.net> wrote:
> Looking through old NG posts I found this one by Ben Hinkle last year
> about extending the "in" operator to support static/dynamic arrays.
>
> http://www.digitalmars.com/d/archives/digitalmars/D/25164
>
> Was this ever totally dismissed?  It just seemed to fall off the radar.
> I'm liking the concept though; I don't know how many times I've written
> loops in quickie utility programs just to hunt for an item and return
> it.  I would also suggest (if it hasn't already been) that "in" be
> overloadable, so that if D supports other data structures in the
> standard library they can use that syntax as well.
>
> The only disadvantage I can think of is that when a developer sees "in",
> he/she can't assume it's a constant-time operation anymore.  I think the
> increase in utility outweighs that though.
>
> Thoughts?

I think it might be better to keep this in library code, i.e. it could be
achieved with a utility function or template, e.g. a template that does a
binary search of a (presumed sorted) object which can be indexed with [].
It could then be used on dynamic/static arrays and other container-style
objects. This sort of thing is what I'd expect to see in the
proposed DTL library.
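A minimal sketch of such a library template, assuming present-day D2 syntax. The name `binaryFind` and the -1 "not found" result are assumptions made for illustration and are not taken from DTL or Phobos; the only requirements on the container are `.length`, `[]` and an element type ordered by `<`.

```d
/// Returns the index of `needle` in the sorted, indexable `haystack`,
/// or -1 if it is absent. The caller must guarantee the sort order.
ptrdiff_t binaryFind(Container, T)(Container haystack, T needle)
{
    size_t lo = 0;
    size_t hi = haystack.length;   // search the half-open range [lo, hi)
    while (lo < hi)
    {
        immutable mid = lo + (hi - lo) / 2;
        if (haystack[mid] < needle)
            lo = mid + 1;          // needle can only be to the right
        else if (needle < haystack[mid])
            hi = mid;              // needle can only be to the left
        else
            return cast(ptrdiff_t) mid;
    }
    return -1;
}
```

Because it is an ordinary template, the same function serves dynamic arrays, static arrays and any user-defined container that offers the two operations.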

Regan

Mar 18 2006

Les Baker <les_baker REMOVEbellsouthREMOVE.net> writes:

> I think it might be better to keep this in library code, i.e. it could be
> achieved with a utility function or template, e.g. a template that does a
> binary search of a (presumed sorted) object which can be indexed with [].
> It could then be used on dynamic/static arrays and other container-style
> objects. This sort of thing is what I'd expect to see in the
> proposed DTL library.

You make a very good point, because all "in" is, is a shortcut for a
method call.  But why even have "in" in the first place, if it has just
one limited usage (looking up a key in an AA)?  Looking at Python,
IIRC, they allow "in" on lists and tuples, and allow overloading it
(__contains__).  Why not D?  (Actually D's seems better: you get the
object as a result, whereas Python's "in" just gives a boolean stating
whether the list/dictionary contains the object.  Again, that's just from
memory, could be wrong.)

I'm digressing though. I do desire an increase in orthogonality
(additionally, I like the "out-loud" readability of "in"; it does
exactly what it says, and means the exact same thing as the discrete
math "in" membership test operator), but I do agree that as much as
possible should be kept in the library.  I'm thinking only
linear/binary search (based upon the "sortedness" of the array) would
have to be added internally.  Custom containers (like the ones you
mentioned would be coming in DTL) would implement their own "opIn", so that
would definitely be kept totally in the standard library, and of course
equality testing would be delegated as well.  I think this needs more
thought, as there is a fine line between having too much built in and
not enough.  My main point is this though: why have an operator that can
only be used on _one_ type of container?
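As a rough sketch of what a container-defined "in" could look like, the following uses the `opBinaryRight` overload from present-day D2 rather than the "opIn" spelling mentioned above. The `IntBag` type is hypothetical, and returning a pointer (null when absent) is a choice made here to mirror what `in` does for AAs.

```d
/// Hypothetical container: an unsorted bag of ints backed by an array.
struct IntBag
{
    private int[] items;

    void add(int value) { items ~= value; }

    /// Enables `value in bag`. Returns a pointer to the stored element,
    /// or null when it is absent, so the result is usable as a boolean.
    int* opBinaryRight(string op)(int value) if (op == "in")
    {
        foreach (ref item; items)   // linear search, delegated equality test
            if (item == value)
                return &item;
        return null;
    }
}
```

A caller can then write `if (auto p = 3 in bag) { ... }`, exactly as with an AA, though the lookup here is linear rather than hashed.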

Les Baker

Mar 18 2006

Stewart Gordon <smjg_1998 yahoo.com> writes:

Les Baker wrote:
> Looking through old NG posts I found this one by Ben Hinkle last year
> about extending the "in" operator to support static/dynamic arrays.
>
> http://www.digitalmars.com/d/archives/digitalmars/D/25164

<snip>

There are two problems with that proposal:

1. At the moment, the only use of the in operator is to determine 
whether a key is present in an AA.  Logically, therefore, for linear arrays,

     x in y

should report on whether x is an index within the bounds of y, i.e.

     x >= 0 && x < y.length

This has been talked about before:

http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/6082


2. The result of the in operator is designed to be directly usable as a
boolean value and to do what it says on the tin when used as such.
Making it return an index, instead of a pointer as in does for AAs,
screws this up totally.  As such, under Ben's proposal

     if (x in y)

would be equivalent to

     if (y.length != 0 && y[0] != x)

which is well and truly counter-intuitive.
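The boolean pitfall can be made concrete with a small sketch. It assumes present-day D2 and Phobos (`std.algorithm.searching.countUntil`); the helper names are hypothetical, and the -1 "not found" value is an assumption, since the original proposal may have used a different sentinel.

```d
import std.algorithm.searching : countUntil;

/// Hypothetical index-returning lookup: position of `x` in `y`, or -1.
ptrdiff_t indexIn(int x, const(int)[] y)
{
    return y.countUntil(x);
}

/// What a condition would effectively test if the raw index were used
/// as a boolean: any non-zero integer counts as true.
bool truthiness(int x, const(int)[] y)
{
    return indexIn(x, y) != 0;
}
```

Under these assumptions an element found at index 0 tests false, while a missing element (-1) tests true, which is the opposite of what a reader of the condition would expect.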

Stewart.

-- 
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/M d- s:- C++  a->--- UB  P+ L E  W++  N+++ o K-  w++  O? M V? PS- 
PE- Y? PGP- t- 5? X? R b DI? D G e++>++++ h-- r-- !y
------END GEEK CODE BLOCK------

My e-mail is valid but not my primary mailbox.  Please keep replies on 
the 'group where everyone may benefit.

Mar 20 2006

Ivan Senji <ivan.senji_REMOVE_ _THIS__gmail.com> writes:

Stewart Gordon wrote:
> Les Baker wrote:
>> Looking through old NG posts I found this one by Ben Hinkle last year
>> about extending the "in" operator to support static/dynamic arrays.
>>
>> http://www.digitalmars.com/d/archives/digitalmars/D/25164
>
> <snip>
>
> There are two problems with that proposal:

To reply or not to reply :)

I see that there are two groups of people when it comes to in and
arrays. Much like there are in the bool story.

 
> 1. At the moment, the only use of the in operator is to determine
> whether a key is present in an AA.  Logically, therefore, for linear
> arrays,
>
>     x in y
>
> should report on whether x is an index within the bounds of y, i.e.

One group thinks this is the way it should work.

 
>     x >= 0 && x < y.length

But that doesn't make any sense. Syntactic sugar for this is not needed,
because this expression is not that hard to write.

AAs and normal arrays are both called arrays but I see them as
completely different things.

Let's say I'm implementing a word counting program in D.
I could do it like this:

int[char[]] words;
Then for each word words[word]++;


But if I wanted to use normal arrays the one storing words would be:

char[] [] words;
int[] count;

Now I have a word and a common thing would be to find out if it is in
'words'.

I would like to be able to simply write something like:

if( (auto x = word in words) == -1 )
{
  words.length = words.length ++;
  count.length = count.length ++;
}
else
{
  count[x]++;
}

The version of 'in' that tests whether an array index is in range would
never (IMO) be useful or used.


So the problem here is the key-value difference in array-AA.

Arrays:
Value - the interesting part
Key - index, usually not that important

AAs:
Key - the interesting part
Value - some additional information about my key, like number of words,
or any other information associated with the interesting part.


I am interested in the interesting part when it comes both to arrays and
AAs.


But I also understand why Walter is probably never going to add this
potentially useful feature: There will always be those that disagree.



 
> This has been talked about before:
>
> http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/6082
>
> 2. The result of the in operator is designed to be directly usable as a
> boolean value and to do what it says on the tin when used as such.
> Making it return an index, instead of a pointer as in does for AAs,
> screws this up totally.  As such, under Ben's proposal
>
>     if (x in y)
>
> would be equivalent to
>
>     if (y.length != 0 && y[0] != x)
 

I don't understand. Are you sure?
Isn't what Ben was suggesting something like:

if(x in y)

is the same as:

int index=-1;

for(int i=0; i<y.length; i++)
{
  if(y[i]==x) {index = i; break;}
}


> which is well and truly counter-intuitive.

Sure is.

But what I wrote above makes sense, doesn't it? Isn't that what everyone
would expect "in" to do for arrays?

Actually I don't care that much whether it returns a pointer or an index.
A pointer might be a bit more useful.

y[x in y] ++; vs. (*(x in y)) ++;


Although this is the way I would expect things to work there seem to be
those thinking that there should be a way to check if there is a certain
value in the associative array. Is this really needed?

Do I really want to know which word has a certain frequency in a text?
Actually I had to do something like that once but I was able to do it by
constructing a reverse AA (For example char[] [int]),
but searching for values in AAs doesn't seem as interesting and useful
as searching for values in normal arrays?

Mar 21 2006

Stewart Gordon <smjg_1998 yahoo.com> writes:

Ivan Senji wrote:
<snip>
> Let's say I'm implementing a word counting program in D.
> I could do it like this:
>
> int[char[]] words;
> Then for each word words[word]++;
>
> But if I wanted to use normal arrays the one storing words would be:

Why would you want to make life harder by using normal arrays instead?

> char[] [] words;
> int[] count;
>
> Now I have a word and a common thing would be to find out if it is in
> 'words'.
>
> I would like to be able to simply write something like:
>
> if((auto x = word in words) == -1 )

std.string.find already has these semantics.  How about having a version 
of this function for other array types?

> {
>   words.length = words.length ++;

That strikes me as undefined behaviour.  Either it's a complete no-op 
(increment words.length and then reassign the original value) or it'll 
assign words.length to itself and then increment it.

>   count.length = count.length ++;
> }

<snip>
> But what I wrote above makes sense, doesn't it? Isn't that what everyone
> would expect "in" to do for arrays?

When people discover that in works on LAs as well as AAs, they will 
expect the result to be equally usable directly as a boolean value.

> Actually I don't care that much whether it returns a pointer or an index.
> A pointer might be a bit more useful.
>
> y[x in y] ++; vs. (*(x in y)) ++;

Indeed.

> Although this is the way I would expect things to work there seem to be
> those thinking that there should be a way to check if there is a certain
> value in the associative array. Is this really needed?

I guess it might have a generic programming advantage in templates that 
can work on either kind of an array.  But can anyone think of a 
practical example?

> Do I really want to know which word has a certain frequency in a text?

That's not the only purpose.  It could be used to implement 
bidirectional mappings in general.

> Actually I had to do something like that once but I was able to do it by
> constructing a reverse AA (For example char[] [int]),

<snip>

Yes, that's a way to do bidirectional mappings that might be more 
efficient than value searching at the moment.  Yet another approach 
would be a single data structure with two hash tables.
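The reverse-AA approach might look roughly like this in present-day D2. The `BiMap` name and its methods are hypothetical, and the policy of dropping an older pair that shares a key or a value is just one possible choice.

```d
/// Hypothetical bidirectional map built from two associative arrays.
struct BiMap
{
    private int[string] forward;    // key   -> value
    private string[int] backward;   // value -> key

    /// Stores the pair, dropping any older pair that shares the key
    /// or the value so that both directions stay consistent.
    void put(string key, int value)
    {
        if (auto oldValue = key in forward)
            backward.remove(*oldValue);
        if (auto oldKey = value in backward)
            forward.remove(*oldKey);
        forward[key] = value;
        backward[value] = key;
    }

    /// Pointer to the value stored for `key`, or null.
    int* byKey(string key) { return key in forward; }

    /// Pointer to the key stored for `value`, or null.
    string* byValue(int value) { return value in backward; }
}
```

Both lookups are ordinary hashed `in` tests on keys, so no value search is needed; the cost is storing every pair twice.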

What you say makes sense on the whole.  But I recall it being stated
somewhere in the docs (I forget where) that every operator has an
intended meaning that should be taken into account when overloading.
Language builtins in particular have a moral duty to set a good example.
It's true that the current meaning of in and the one being proposed have
something in common, namely that they check for containment of what
you consider the "interesting" feature in each case.  But "interesting"
is a highly subjective concept, varying both from person to person and
from application to application, and you could argue either way on
whether a given judgement of "interesting" is an acceptable basis for
what the overloaded uses of one operator have in common.

Stewart.


Mar 22 2006
