
digitalmars.D - 'in' syntax for searching arrays

reply Denton Cockburn <diboss hotmail.com> writes:
What would be the problem in extending the 'in' syntax used for AA to
regular arrays?

I think it's rather nice syntactic sugar.
Nov 03 2007
parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Denton Cockburn wrote:
 What would be the problem in extending the 'in' syntax used for AA to
 regular arrays?
 
 I think it's rather nice syntactic sugar.
What it should mean. See the series of arguments and counter-arguments in the enh request I filed: http://d.puremagic.com/issues/show_bug.cgi?id=1323

So which meaning were you hoping it would have?

--bb
Nov 03 2007
parent reply Denton Cockburn <diboss hotmail.com> writes:
On Sun, 04 Nov 2007 13:42:41 +0900, Bill Baxter wrote:

 Denton Cockburn wrote:
 What would be the problem in extending the 'in' syntax used for AA to
 regular arrays?
 
 I think it's rather nice syntactic sugar.
What it should mean. See the series of arguments and counter-arguments in the enh request I filed: http://d.puremagic.com/issues/show_bug.cgi?id=1323 So which meaning were you hoping it would have? --bb
search by values, not by index. I wrote a contains function, and I assume everyone who comes across a similar use-case does the same. I just figure it's done often enough that adding it wouldn't be hard, especially considering the keyword is already there, and with the same general meaning.
Nov 03 2007
next sibling parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Denton Cockburn wrote:
 On Sun, 04 Nov 2007 13:42:41 +0900, Bill Baxter wrote:
 
 Denton Cockburn wrote:
 What would be the problem in extending the 'in' syntax used for AA to
 regular arrays?

 I think it's rather nice syntactic sugar.
What it should mean. See the series of arguments and counter-arguments in the enh request I filed: http://d.puremagic.com/issues/show_bug.cgi?id=1323 So which meaning were you hoping it would have? --bb
search by values, not by index. I wrote a contains function, and I assume everyone who comes across a similar use-case does the same. I just figure it's done often enough that adding it wouldn't be hard, especially considering the keyword is already there, and with the same general meaning.
Yeh, well that's what I think it should mean, and that's what Python thinks it should mean. But not everyone agrees, and Walter doesn't seem interested. --bb
Nov 03 2007
next sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
How coincidental. Only yesterday (on another thread) I posted my
thought that it would be cool to have a std.array module with lots of
array functions, with as much power as those of PHP (though not with
the same syntax!)

My own thought is that looking for values in arrays should have the
same syntax as strings. That is, for non-associative arrays:

V[] a;
int n = a.find(x);

should return -1 if the value is not present in the array, or the
index if it is. This is a much better idea than returning a pointer to
the element, because (a) you don't have to worry about const and
invariant, and (b) it's the same as std.string. In addition, you can
then do a[n..$].find() to find the second occurrence, and so on.

For associative arrays, it's trickier to decide on the syntax. My
first thought was

V[K] a;
K n = a.find(x); // return a key whose value is x

but that doesn't really work so well. First, there's no generic
equivalent to -1, and hence no way to say "no match". Second, there's
no real way of searching for the second match, once you've found the
first. So then, my second thought was, maybe it should be

V[K] a;
K[] n = a.findAll(x); // return an array of all keys whose value is x

PHP has a function which "flips" an array, so that keys become values
and values become keys. That would be useful too.
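
The two proposals above could be sketched as ordinary library functions. This is only a minimal sketch: find and findAll here are hypothetical helpers written for illustration (they are not existing Phobos or Tango functions), and ptrdiff_t is used instead of int so that -1 stays representable on any platform.

```d
import std.algorithm.sorting : sort;

// Hypothetical helper: index of the first element equal to x, or -1 if absent.
ptrdiff_t find(T)(T[] a, T x)
{
    foreach (i, e; a)
        if (e == x)
            return cast(ptrdiff_t) i;
    return -1;
}

// Hypothetical helper: every key of the associative array whose value equals x.
// The order of the returned keys is unspecified.
K[] findAll(K, V)(V[K] aa, V x)
{
    K[] keys;
    foreach (k, v; aa)
        if (v == x)
            keys ~= k;
    return keys;
}

unittest
{
    int[] a = [10, 20, 30];
    assert(a.find(20) == 1);
    assert(a.find(99) == -1);

    int[string] ages = ["ann": 30, "bob": 25, "cy": 30];
    auto keys = ages.findAll(30);
    sort(keys); // sort so the check does not depend on AA iteration order
    assert(keys == ["ann", "cy"]);
}
```

An empty result from findAll plays the role that -1 plays for plain arrays, which sidesteps the "no generic equivalent to -1" problem.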
Nov 04 2007
next sibling parent reply Matti Niemenmaa <see_signature for.real.address> writes:
Janice Caron wrote:
 V[] a;
 int n = a.find(x);
 
 should return -1 if the value is not present in the array, or the
 index if it is. This is a much better idea than returning a pointer to
 the element, because (a) you don't have to worry about const and
 invariant, and (b) it's the same as std.string. In addition, you can
 then do a[n..$].find() to find the second occurrence, and so on.
As an aside: Tango uses the idiom that find returns the length of the given array if no match is found. This allows it to return a size_t (if you ever use arrays longer than int.max...) whilst still having a clear error value.

--
E-mail address: matti.niemenmaa+news, domain is iki (DOT) fi
Nov 04 2007
parent Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Matti Niemenmaa wrote:
 Janice Caron wrote:
 V[] a;
 int n = a.find(x);

 should return -1 if the value is not present in the array, or the
 index if it is. This is a much better idea than returning a pointer to
 the element, because (a) you don't have to worry about const and
 invariant, and (b) it's the same as std.string. In addition, you can
 then do a[n..$].find() to find the second occurrence, and so on.
As an aside: Tango uses the idiom that find returns the length of the given array if no match is found. This allows it to return a size_t (if you ever use arrays longer than int.max...) whilst still having a clear error value.
It is also helpful that arr.length is a valid slice delimiter, so that with Tango, a[a.find(whatever)..$] and a[0..a.find(something)] remain valid even when no match is found.

-- Oskar
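
The "return the length on no match" idiom can be illustrated with a small sketch. findOrLength is a hypothetical stand-in written for this example; it is not Tango's actual function and makes no claim about Tango's signature.

```d
// Hypothetical helper: index of the first element equal to x,
// or a.length when there is no match.
size_t findOrLength(T)(T[] a, T x)
{
    foreach (i, e; a)
        if (e == x)
            return i;
    return a.length;
}

unittest
{
    int[] a = [4, 8, 15];

    // Match found: the slices split the array at the match.
    assert(a[findOrLength(a, 8) .. $] == [8, 15]);
    assert(a[0 .. findOrLength(a, 8)] == [4]);

    // No match: both slices are still in bounds, no special case needed.
    assert(a[findOrLength(a, 99) .. $].length == 0);
    assert(a[0 .. findOrLength(a, 99)] == a);
}
```

With a -1 sentinel, the last two slices would need an explicit check before slicing.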
Nov 04 2007
prev sibling parent Regan Heath <regan netmail.co.nz> writes:
Janice Caron wrote:
 How coincidental. Only yesterday (on another thread) I posted my
 thought that it would be cool to have a std.array module with lots of
 array functions, with as much power as those of PHP (though not with
 the same syntax!)
 
 My own thought is that looking for values in arrays should have the
 same syntax as strings. That is, for non-associative arrays:
 
 V[] a;
 int n = a.find(x);
 
 should return -1 if the value is not present in the array, or the
 index if it is. This is a much better idea than returning a pointer to
 the element, because (a) you don't have to worry about const and
 invariant, and (b) it's the same as std.string. In addition, you can
 then do a[n..$].find() to find the second occurrence, and so on.
You do have to remember to add the result of the 2nd call to the result of the first to get the index of the 2nd occurrence though. I've forgotten to do it almost as many times as I've coded something using this. Regan
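
A short sketch of that pitfall, assuming a hypothetical findIndex helper that returns -1 on no match. Note that the second search has to start one past the first hit, so both the first index and that extra 1 must be added back.

```d
// Hypothetical helper: index of the first element equal to x, or -1.
ptrdiff_t findIndex(T)(T[] a, T x)
{
    foreach (i, e; a)
        if (e == x)
            return cast(ptrdiff_t) i;
    return -1;
}

unittest
{
    int[] a = [7, 3, 7, 9, 7];

    auto first = findIndex(a, 7);
    assert(first == 0);

    // Search the remainder, starting just after the first hit.
    auto start = cast(size_t) first + 1;
    auto rel = findIndex(a[start .. $], 7);
    assert(rel == 1); // relative to the slice, not to a

    // Convert back to an index into the original array.
    auto second = cast(ptrdiff_t) start + rel;
    assert(second == 2);
}
```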
Nov 05 2007
prev sibling next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 11/4/07, Janice Caron <caron800 googlemail.com> wrote:
 My own thought is that looking for values in arrays should have the
 same syntax as strings. That is, for non-associative arrays:

 V[] a;
 int n = a.find(x);

 should return -1 if the value is not present in the array, or the
 index if it is.
In point of fact, if find() were defined for arrays of any kind, instead of just char[] (or now, "const const char[]", whatever that means), then the string functions would almost be subsumed. (I say almost, because the string functions search for a dchar, not a char, and hence do UTF-8 decoding.)

Of course I'd also like to see the std.string functions extended to include wstrings and dstrings, but that's off-topic for this thread.
 So then, my second thought was, maybe it should be

 V[K] a;
 K[] n = a.findAll(x); // return an array of all keys whose value is x
My third thought is a function findEach() which returns a class which overloads foreach, so you could do

V[K] a;
foreach (key; a.findEach(value)) {}

But again, all I'm talking about here are library functions, not language extensions.
Nov 04 2007
prev sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 11/4/07, Janice Caron <caron800 googlemail.com> wrote:
 My third thought is a function findEach() which returns a class which
 overloads foreach, so you could do

 V[K] a;
 foreach (key; a.findEach(value)) {}
Ooh - on fourth thoughts, I'd like to rename that function "each". As in

foreach (key; a.each(value)) { /*do something*/ }

I like that.
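
One way such a function might look is sketched here. It is an illustration under assumptions, not an existing library API: KeysWithValue and each are hypothetical names, and a struct with opApply is used in place of the class mentioned above, since a struct is enough to support foreach.

```d
import std.algorithm.sorting : sort;

// Hypothetical iterable: visits every key of aa whose value equals wanted.
struct KeysWithValue(K, V)
{
    V[K] aa;
    V wanted;

    // opApply is what lets foreach iterate over this struct.
    int opApply(scope int delegate(ref K) dg)
    {
        foreach (k, v; aa)
        {
            if (v == wanted)
            {
                K key = k;
                if (auto result = dg(key))
                    return result; // the loop body executed break or return
            }
        }
        return 0;
    }
}

// Hypothetical factory function matching the each(value) idea.
KeysWithValue!(K, V) each(K, V)(V[K] aa, V wanted)
{
    return KeysWithValue!(K, V)(aa, wanted);
}

unittest
{
    int[string] ages = ["ann": 30, "bob": 25, "cy": 30];

    string[] seen;
    foreach (key; ages.each(30))
        seen ~= key;

    sort(seen); // AA iteration order is unspecified
    assert(seen == ["ann", "cy"]);
}
```

Unlike a findAll that builds an array, this visits matching keys lazily and allocates nothing for the result.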
Nov 04 2007
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Denton Cockburn wrote:
 I just figure it's done often enough that adding it wouldn't be hard,
 especially considering the keyword is already there, and with the same
 general meaning.
With AAs, the 'in' expression looks for the key, not the value. It would be inconsistent to overload it to search for values in regular arrays.
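
The key-versus-value distinction can be seen in a few lines. For an associative array, in takes a key and yields a pointer to the value (or null). The value search on a plain array is shown with canFind from std.algorithm.searching, which is a library function available in current Phobos rather than anything that existed in this form at the time of the thread.

```d
import std.algorithm.searching : canFind;

unittest
{
    int[int] squares = [2: 4, 3: 9];

    // 'in' looks up the key and returns a pointer to its value.
    int* p = 2 in squares;
    assert(p !is null && *p == 4);

    // 4 is a value here, not a key, so the lookup fails.
    assert((4 in squares) is null);

    // Searching a regular array by value is done with a function.
    int[] values = [4, 9];
    assert(values.canFind(4));
    assert(!values.canFind(2));
}
```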
Nov 04 2007
next sibling parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Walter Bright wrote:
 Denton Cockburn wrote:
 I just figure it's done often enough that adding it wouldn't be hard,
 especially considering the keyword is already there, and with the same
 general meaning.
With AAs, the 'in' expression looks for the key, not the value. It would be inconsistent to overload it to search for values in regular arrays.
A foolish consistency. Arrays and associative arrays are different things. --bb
Nov 04 2007
parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Bill Baxter wrote:
 Walter Bright wrote:
 Denton Cockburn wrote:
 I just figure it's done often enough that adding it wouldn't be hard,
 especially considering the keyword is already there, and with the same
 general meaning.
With AAs, the 'in' expression looks for the key, not the value. It would be inconsistent to overload it to search for values in regular arrays.
A foolish consistency. Arrays and associative arrays are different things. --bb
It's inconsistent either way, for that same reason! (that arrays and associative arrays are different things). 'in' should not be overloaded in any way whatsoever between those two types, neither to search for indexes nor for values in regular arrays.

Arrays and "associative arrays" are *very* different things. They only share the "array" name and the aspect that they are indexable. But their semantics are very different. Arrays have a lot more in common with lists than "associative arrays". Arrays and lists should have appropriately named methods such as 'contains', 'sort', 'slice', etc., while "associative arrays" should have other methods such as 'containsKey', 'containsValue', etc. ('slice' and others don't even make sense).

And btw, which languages besides D call associative arrays "associative arrays"? (instead of "maps" or other things more sensible)

--
Bruno Medeiros - MSc in CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Nov 05 2007
next sibling parent Ary Borenszweig <ary esperanto.org.ar> writes:
Bruno Medeiros escribió:
 Bill Baxter wrote:
 Walter Bright wrote:
 Denton Cockburn wrote:
 I just figure it's done often enough that adding it wouldn't be hard,
 especially considering the keyword is already there, and with the same
 general meaning.
With AAs, the 'in' expression looks for the key, not the value. It would be inconsistent to overload it to search for values in regular arrays.
A foolish consistency. Arrays and associative arrays are different things. --bb
It's inconsistent either way, for that same reason! (that arrays and associative arrays are different things). 'in' should not be overloaded in any way whatsoever between those two types. Neither to search for indexes or values in regular arrays. Arrays and "associative arrays" are *very* different things. They only share the "array" name and the aspect that they are indexable. But their semantics are very different. Arrays have a lot more in common with lists than "associative arrays". Arrays and lists should have appropriately named methods such as 'contains', 'sort', 'slice', etc., while "associative arrays" should have other methods such as 'containsKey', 'containsValue', etc. ('slice' and others don't even make sense). And btw, which languages besides D call associative arrays as "associative arrays"? (instead of "maps" or other things more sensible)
PHP: http://ar.php.net/array
Nov 05 2007
prev sibling parent reply Regan Heath <regan netmail.co.nz> writes:
Bruno Medeiros wrote:
 Bill Baxter wrote:
 Walter Bright wrote:
 Denton Cockburn wrote:
 I just figure it's done often enough that adding it wouldn't be hard,
 especially considering the keyword is already there, and with the same
 general meaning.
With AAs, the 'in' expression looks for the key, not the value. It would be inconsistent to overload it to search for values in regular arrays.
A foolish consistency. Arrays and associative arrays are different things. --bb
It's inconsistent either way, for that same reason! (that arrays and associative arrays are different things). 'in' should not be overloaded in any way whatsoever between those two types. Neither to search for indexes or values in regular arrays. Arrays and "associative arrays" are *very* different things. They only share the "array" name and the aspect that they are indexable. But their semantics are very different. Arrays have a lot more in common with lists than "associative arrays". Arrays and lists should have appropriately named methods such as 'contains', 'sort', 'slice', etc., while "associative arrays" should have other methods such as 'containsKey', 'containsValue', etc. ('slice' and others don't even make sense). And btw, which languages besides D call associative arrays as "associative arrays"? (instead of "maps" or other things more sensible)
A map is a piece of paper or book with roads and rivers drawn on it. <g> Regan
Nov 06 2007
parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Regan Heath wrote:
 
 A map is a piece of paper or book with roads and rivers drawn on it.  <g>
 
 Regan
No, it's a game scenario where a match takes place. :P -- Bruno Medeiros - MSc in CS/E student http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Nov 06 2007
prev sibling parent reply Derek Parnell <derek psych.ward> writes:
On Sun, 04 Nov 2007 12:13:35 -0800, Walter Bright wrote:

 Denton Cockburn wrote:
 I just figure it's done often enough that adding it wouldn't be hard,
 especially considering the keyword is already there, and with the same
 general meaning.
With AAs, the 'in' expression looks for the key, not the value. It would be inconsistent to overload it to search for values in regular arrays.
No it wouldn't.

If someone says "is 'cat' in that list?" nearly everyone knows exactly what was meant. If the list is 'ordered' (i.e. an AA) the person was looking for an index of 'cat', and if the list was not 'ordered' (a regular array) the person was looking for content of 'cat'.

For example, I have a dictionary and wonder if 'cat' is in it. I look up the word "cat" instead of scanning every page for the first mention of a cat. I have a newspaper article about dog training and wonder if 'cat' is in it, so I scan for the first mention of a cat.

It is not an inconsistent overload. For such an example of inconsistent overloads see "static", or "in" when used with DbC.

--
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell
Nov 04 2007
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 11/4/07, Derek Parnell <derek psych.ward> wrote:
 If someone says "is 'cat' in that list?" nearly everyone knows exactly what
 was meant.
I'd rather have consistency - particularly when it would be so easy to add a find() function to do the search-for-a-value thing.
Nov 04 2007
parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Janice Caron wrote:
 On 11/4/07, Derek Parnell <derek psych.ward> wrote:
 If someone says "is 'cat' in that list?" nearly everyone knows exactly what
 was meant.
I'd rather have consistency - particularly when it would be so easy to add a find() function to do the search-for-a-value thing.
I think his point was that the double meaning for "in" *is* consistent with how people use the word in everyday life. On the other hand, there are lots of keywords in D that have completely non-orthogonal meanings.

The consistency argument is, frankly, a load of codswallop when you consider that "in" already has *three* unrelated meanings: "membership test", "precondition" and "pass by value." And it's not the only keyword like that; static has multiple meanings, too. And every other operator in the language can have subtly different meanings depending on how it's been overloaded. The important thing is that the operators all do "the right thing" for the given type.

An example: you can't use real[4] + real[4], but you *can* do vector!(real, 4) + vector!(real, 4). I mean, they're basically the exact same thing. Thus, this is inconsistent, and you shouldn't be allowed to add vectors. But that's stupid because adding vectors is actually *really* useful, and I'm actually quite capable of putting arrays and vectors into different mental boxes.

Arrays vs. maps is no different; in that case you're saying that "in" shouldn't be usable on arrays because it's implemented differently for AAs.

Anyway, you can probably tell which side of the fence I'm sitting on. Just thought I'd chip in my AU$0.02.

-- Daniel
Nov 04 2007