digitalmars.D.learn - About the in expression, Why can't use with array.

lili (3/3) Oct 24 2019 Hi:

Paul Backus (4/7) Oct 24 2019 Checking for the presence of an item in an array requires a

Jonathan M Davis (16/24) Oct 24 2019 In particular, the reason that linear search is considered unacceptable ...

lili (7/48) Oct 24 2019 This reason somewhat farfetched， Think about that if this reason

Jonathan M Davis (17/69) Oct 24 2019 std.container specifically discusses the expected, worse-case complexity...

=?UTF-8?Q?Ali_=c3=87ehreli?= (13/16) Oct 24 2019 In addition to the big O surprise, there is another important issue that...

Dennis (16/19) Oct 25 2019 No it isn't. Worst case lookup of an associative array lookup is

Paul Backus (3/6) Oct 25 2019 Worth noting that `opIn` is a D1 operator overload, and thus

mipri (18/24) Oct 25 2019 That wasn't mentioned though? Does it reverse the arguments so

Paul Backus (3/13) Oct 25 2019 I made a mistake reading the language spec and missed `in` in the

=?UTF-8?Q?Ali_=c3=87ehreli?= (20/23) Oct 25 2019 Agreed and I've just realized a benefit of the 'in' operator for arrays,...

Dennis (25/30) Oct 25 2019 I'm personally not concerned about generic code.

IGotD- (9/40) Oct 25 2019 I'm also not so fond of that "in" operator returns a pointer

lili (4/49) Oct 25 2019 I argee with your talk, lanuage designer need keep consistent

IGotD- (15/18) Oct 25 2019 I don't see much of a problem why this couldn't be implemented as

lili <akozhao tencent.com> writes:

Hi:
    In Dlang where is strange design. The in expression can only 
use to associative array, why array can not use in expression.

Oct 24 2019

Paul Backus <snarwin gmail.com> writes:

On Thursday, 24 October 2019 at 12:58:11 UTC, lili wrote:
 Hi:
    In Dlang where is strange design. The in expression can only 
 use to associative array, why array can not use in expression.

Checking for the presence of an item in an array requires a 
linear search. You can do it with std.algorithm.searching.canFind:

https://dlang.org/phobos/std_algorithm_searching.html#.canFind

Oct 24 2019

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Thursday, October 24, 2019 7:04:56 AM MDT Paul Backus via Digitalmars-d-
learn wrote:
 On Thursday, 24 October 2019 at 12:58:11 UTC, lili wrote:
 Hi:
    In Dlang where is strange design. The in expression can only

 use to associative array, why array can not use in expression.

 Checking for the presence of an item in an array requires a
 linear search. You can do it with std.algorithm.searching.canFind:

 https://dlang.org/phobos/std_algorithm_searching.html#.canFind

In particular, the reason that linear search is considered unacceptable for
in is so that generic code can rely on its performance. The idea is that
types shouldn't implement the in operator unless they can do so with
O(log n) or better (O(log n) being what it costs to get to an item in a
balanced binary tree like a red-black tree). That way, when you calculate
the complexity of any algorithm using in, you can assume that it's O(log n)
at worst. Having it be O(n) instead (like it would be for an array) would
drastically increase the complexity of the algorithm and make it take much
longer when processing a large number of items. And the standard library
provides functions like canFind or find for finding elements in an array, so
having in work with arrays wouldn't add any functionality. It would
basically just change the syntax you'd use for finding an element in an
array.

- Jonathan M Davis

Oct 24 2019

lili <akozhao tencent.com> writes:

On Thursday, 24 October 2019 at 22:40:31 UTC, Jonathan M Davis 
wrote:
 On Thursday, October 24, 2019 7:04:56 AM MDT Paul Backus via 
 Digitalmars-d- learn wrote:
 On Thursday, 24 October 2019 at 12:58:11 UTC, lili wrote:
 Hi:
    In Dlang where is strange design. The in expression can 
 only

 use to associative array, why array can not use in 
 expression.

 Checking for the presence of an item in an array requires a 
 linear search. You can do it with 
 std.algorithm.searching.canFind:

 https://dlang.org/phobos/std_algorithm_searching.html#.canFind

 In particular, the reason that linear search is considered 
 unacceptable for
 in is so that generic code can rely on its performance. The 
 idea is that
 types shouldn't implement the in operator unless they can do so 
 with
 O(log n) or better (O(log n) being what it costs to get to an 
 item in a
 balanced binary tree like a red-black tree). That way, when you 
 calculate
 the complexity of any algorithm using in, you can assume that 
 it's O(log n)
 at worst. Having it be O(n) instead (like it would be for an 
 array) would
 drastically increase the complexity of the algorithm and make 
 it take much
 longer when processing a large number of items. And the 
 standard library
 provides functions like canFind or find for finding elements in 
 an array, so
 having in work with arrays wouldn't add any functionality. It 
 would
 basically just change the syntax you'd use for finding an 
 element in an
 array.

 - Jonathan M Davis

  This reason somewhat farfetched， Think about that if this reason 
is right， the operator overload can not use。 because same 
operator on different type expression same mean but different 
implementation and complexity。 so why operator overload can don't 
care about implementation but 'in' operator need。

Oct 24 2019

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Thursday, October 24, 2019 8:40:59 PM MDT lili via Digitalmars-d-learn 
wrote:
 On Thursday, 24 October 2019 at 22:40:31 UTC, Jonathan M Davis

 wrote:
 On Thursday, October 24, 2019 7:04:56 AM MDT Paul Backus via

 Digitalmars-d- learn wrote:
 On Thursday, 24 October 2019 at 12:58:11 UTC, lili wrote:
 Hi:
    In Dlang where is strange design. The in expression can

 only

 use to associative array, why array can not use in
 expression.

 Checking for the presence of an item in an array requires a
 linear search. You can do it with
 std.algorithm.searching.canFind:

 https://dlang.org/phobos/std_algorithm_searching.html#.canFind

 In particular, the reason that linear search is considered
 unacceptable for
 in is so that generic code can rely on its performance. The
 idea is that
 types shouldn't implement the in operator unless they can do so
 with
 O(log n) or better (O(log n) being what it costs to get to an
 item in a
 balanced binary tree like a red-black tree). That way, when you
 calculate
 the complexity of any algorithm using in, you can assume that
 it's O(log n)
 at worst. Having it be O(n) instead (like it would be for an
 array) would
 drastically increase the complexity of the algorithm and make
 it take much
 longer when processing a large number of items. And the
 standard library
 provides functions like canFind or find for finding elements in
 an array, so
 having in work with arrays wouldn't add any functionality. It
 would
 basically just change the syntax you'd use for finding an
 element in an
 array.

 - Jonathan M Davis

   This reason somewhat farfetched， Think about that if this reason
 is right， the operator overload can not use。 because same
 operator on different type expression same mean but different
 implementation and complexity。 so why operator overload can don't
 care about implementation but 'in' operator need。

std.container specifically discusses the expected, worse-case complexity of
(i.e. Big-O complexity) all of the various container-related functions -
including the in operator:

https://dlang.org/phobos/std_container.html

As such, anyone writing an algorithm using any of those functions can rely
on the worst-case complexity of each operation and accurately calculate the
overall, worst-case complexity of their algorithm. Sure, someone could
choose to overload the in operator in a manner which does not follow those
rules, but it's considered bad practice to do so, and no official
implementation of any kind is going to provide a function with a worse
complexity than what std.container lays out - and that includes operators
for the built-in types like a dynamic array. This is basically the same
approach that C++ takes with its STL containers and their related
operations.

- Jonathan M Davis

Oct 24 2019

=?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:

On 10/24/2019 05:58 AM, lili wrote:
 Hi:
     In Dlang where is strange design. The in expression can only use to
 associative array, why array can not use in expression.

In addition to the big O surprise, there is another important issue that 
may be seen as either for or against supporting the 'in' operator for 
arrays: With associative arrays, 'in' takes the keys; since keys are 
index values for arrays, the in operator would be trivially


   return (key < __length
           ? &__ptr[key]
           : null);

Given the two differences, 'in' does not make sense on arrays:

- Big O is different

- One wants to use keys for associative arrays but (likely) values for 
arrays

Ali

Oct 24 2019

Dennis <dkorpel gmail.com> writes:

On Friday, 25 October 2019 at 05:17:35 UTC, Ali Çehreli wrote:
 - Big O is different

No it isn't. Worst case lookup of an associative array lookup is 
O(n) too. It can easily be 'achieved' by having a key type with:
```
size_t toHash() const scope pure {
     return 0;
}
```

The fact that std.container specifies a worst-case time 
complexity for operations on its data structures doesn't mean 
every other type has to comply to those too.
I can overload the 'in' operator on my types to something that 
takes exponential time if I want, just like "+" can also be 
overloaded to a linear time operation on e.g. BigInt.

 - One wants to use keys for associative arrays but (likely) 
 values for arrays

Then we implement `in` to look for values in arrays, just like 
users would expect.

Oct 25 2019

Paul Backus <snarwin gmail.com> writes:

On Friday, 25 October 2019 at 09:25:21 UTC, Dennis wrote:
 I can overload the 'in' operator on my types to something that 
 takes exponential time if I want, just like "+" can also be 
 overloaded to a linear time operation on e.g. BigInt.

Worth noting that `opIn` is a D1 operator overload, and thus 
deprecated as of 2.088.0.

Oct 25 2019

mipri <mipri minimaltype.com> writes:

On Friday, 25 October 2019 at 15:52:50 UTC, Paul Backus wrote:
 On Friday, 25 October 2019 at 09:25:21 UTC, Dennis wrote:
 I can overload the 'in' operator on my types to something that 
 takes exponential time if I want, just like "+" can also be 
 overloaded to a linear time operation on e.g. BigInt.

 Worth noting that `opIn` is a D1 operator overload, and thus 
 deprecated as of 2.088.0.

That wasn't mentioned though? Does it reverse the arguments so
that the writeln() examples below can be more naturally ordered?


   import std.stdio;

   struct X {
       string data;

       bool opBinary(string op)(char p) if (op == "in") {
           foreach (c; data)
               if (p == c) return true;
           return false;
       }
   }

   void main() {
       auto x = X("hi there");
       writeln(x in 'i');
       writeln(x in '?');
   }

Oct 25 2019

Paul Backus <snarwin gmail.com> writes:

On Friday, 25 October 2019 at 16:01:57 UTC, mipri wrote:
 On Friday, 25 October 2019 at 15:52:50 UTC, Paul Backus wrote:
 On Friday, 25 October 2019 at 09:25:21 UTC, Dennis wrote:
 I can overload the 'in' operator on my types to something 
 that takes exponential time if I want, just like "+" can also 
 be overloaded to a linear time operation on e.g. BigInt.

 Worth noting that `opIn` is a D1 operator overload, and thus 
 deprecated as of 2.088.0.

 That wasn't mentioned though? Does it reverse the arguments so
 that the writeln() examples below can be more naturally ordered?

I made a mistake reading the language spec and missed `in` in the 
list of operators covered by opBinary. Please disregard!

Oct 25 2019

=?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:

On 10/25/2019 02:25 AM, Dennis wrote:
 On Friday, 25 October 2019 at 05:17:35 UTC, Ali =C3=87ehreli wrote:
 - Big O is different

 No it isn't.

Agreed and I've just realized a benefit of the 'in' operator for arrays, =

which I haven't heard before. (I don't follow all discussions in=20
detail.) 'in' can

- linear search on regular arrays

- binary search on SortedRanges

So, this could would better for the programmer. 'in' would pick the=20
fastest option without ever being slower than what the programmer would=20
do. (The programmer can call canFind() on a regular but otherwise sorted =

array.) However, the following gotcha is still possible:

   auto sorted =3D sort(arr);
   if (element in arr) { // Oops; actually meant 'element in sorted'
     // ...
   }

I'm still not completely sold on the whole idea though because it's not=20
a clear win.

Do others see other advantages in other places like templates? For=20
example, could templates really be written generically for arrays and=20
associative arrays?

Ali

Oct 25 2019

Dennis <dkorpel gmail.com> writes:

On Friday, 25 October 2019 at 19:49:05 UTC, Ali Çehreli wrote:
 I'm still not completely sold on the whole idea though because 
 it's not a clear win.

 Do others see other advantages in other places like templates? 
 For example, could templates really be written generically for 
 arrays and associative arrays?

I'm personally not concerned about generic code.
- The semantics of `in` aren't very well defined anyways.
- Those who write templates like that (hopefully) know what 
they're doing, they'll figure it out  ;)
- I can't think of situations where you actually want to write 
code that generically works on both arrays and associative arrays 
like that. (Though if anyone knows one, please share, I'm 
interested.)

I'm more concerned about the repeated reports of users being 
surprised that `in` doesn't work like they expect. In Python, the 
expression `3 in [2, 3, 4]` returns a boolean, and in D you can 
do `bool b = 15 in iota(10, 20)` because the operator is 
overloaded in Phobos for iota; But as far as actual language 
support, `in` is only defined for associative arrays:

https://dlang.org/spec/expression.html#InExpression

It returns a pointer which can be used in an if-statement, but 
also to read/modify the value. So should `in` on arrays do the 
same? It would be consistent, but usage of raw pointers is 
discouraged with the advent of scope and ref etc. Also you can't 
implicitly convert a pointer to a boolean.
Should it be a boolean then? That means part of the result of the 
linear search is discarded, making `in` less flexible.
So maybe we should leave it for now, and put a small explanation 
in the error message.

Oct 25 2019

IGotD- <nise nise.com> writes:

On Friday, 25 October 2019 at 20:44:18 UTC, Dennis wrote:
 On Friday, 25 October 2019 at 19:49:05 UTC, Ali Çehreli wrote:
 I'm still not completely sold on the whole idea though because 
 it's not a clear win.

 Do others see other advantages in other places like templates? 
 For example, could templates really be written generically for 
 arrays and associative arrays?

 I'm personally not concerned about generic code.
 - The semantics of `in` aren't very well defined anyways.
 - Those who write templates like that (hopefully) know what 
 they're doing, they'll figure it out  ;)
 - I can't think of situations where you actually want to write 
 code that generically works on both arrays and associative 
 arrays like that. (Though if anyone knows one, please share, 
 I'm interested.)

 I'm more concerned about the repeated reports of users being 
 surprised that `in` doesn't work like they expect. In Python, 
 the expression `3 in [2, 3, 4]` returns a boolean, and in D you 
 can do `bool b = 15 in iota(10, 20)` because the operator is 
 overloaded in Phobos for iota; But as far as actual language 
 support, `in` is only defined for associative arrays:

 https://dlang.org/spec/expression.html#InExpression

 It returns a pointer which can be used in an if-statement, but 
 also to read/modify the value. So should `in` on arrays do the 
 same? It would be consistent, but usage of raw pointers is 
 discouraged with the advent of scope and ref etc. Also you 
 can't implicitly convert a pointer to a boolean.
 Should it be a boolean then? That means part of the result of 
 the linear search is discarded, making `in` less flexible.
 So maybe we should leave it for now, and put a small 
 explanation in the error message.

I'm also not so fond of that "in" operator returns a pointer 
which is bad fit for arrays and possibly many other container 
algorithms as well. I think the best would be that it would 
return an optional of type T and T would be user defined. For 
associative arrays, a pointer or a reference to the element. For 
arrays the index of that element would suitable. Also overloading 
would be nice so that "in" operator could return several possible 
optional types, not sure if that would be possible though.

Oct 25 2019

lili <akozhao tencent.com> writes:

On Friday, 25 October 2019 at 21:06:53 UTC, IGotD- wrote:
 On Friday, 25 October 2019 at 20:44:18 UTC, Dennis wrote:
 On Friday, 25 October 2019 at 19:49:05 UTC, Ali Çehreli wrote:
 I'm still not completely sold on the whole idea though 
 because it's not a clear win.

 Do others see other advantages in other places like 
 templates? For example, could templates really be written 
 generically for arrays and associative arrays?

 I'm personally not concerned about generic code.
 - The semantics of `in` aren't very well defined anyways.
 - Those who write templates like that (hopefully) know what 
 they're doing, they'll figure it out  ;)
 - I can't think of situations where you actually want to write 
 code that generically works on both arrays and associative 
 arrays like that. (Though if anyone knows one, please share, 
 I'm interested.)

 I'm more concerned about the repeated reports of users being 
 surprised that `in` doesn't work like they expect. In Python, 
 the expression `3 in [2, 3, 4]` returns a boolean, and in D 
 you can do `bool b = 15 in iota(10, 20)` because the operator 
 is overloaded in Phobos for iota; But as far as actual 
 language support, `in` is only defined for associative arrays:

 https://dlang.org/spec/expression.html#InExpression

 It returns a pointer which can be used in an if-statement, but 
 also to read/modify the value. So should `in` on arrays do the 
 same? It would be consistent, but usage of raw pointers is 
 discouraged with the advent of scope and ref etc. Also you 
 can't implicitly convert a pointer to a boolean.
 Should it be a boolean then? That means part of the result of 
 the linear search is discarded, making `in` less flexible.
 So maybe we should leave it for now, and put a small 
 explanation in the error message.

 I'm also not so fond of that "in" operator returns a pointer 
 which is bad fit for arrays and possibly many other container 
 algorithms as well. I think the best would be that it would 
 return an optional of type T and T would be user defined. For 
 associative arrays, a pointer or a reference to the element. 
 For arrays the index of that element would suitable. Also 
 overloading would be nice so that "in" operator could return 
 several possible optional types, not sure if that would be 
 possible though.

I argee with your talk, lanuage designer need keep consistent 
that the 'in' semantics. or not support 'in' key word. Don't 
assumption that how user use it.

Oct 25 2019

IGotD- <nise nise.com> writes:

On Thursday, 24 October 2019 at 12:58:11 UTC, lili wrote:
 Hi:
    In Dlang where is strange design. The in expression can only 
 use to associative array, why array can not use in expression.

I don't see much of a problem why this couldn't be implemented as 
long as the user understands the limitations of it. In real life, 
linear searches are often completely acceptable when the number 
of items are in a few hundreds or less of them. Depends on the 
use case of course.

Many has answered because of search complexity but one thing is 
that use case with associative arrays is to have unique keys. 
This is not the case for arrays which allows duplicates which not 
to seldom the case. Naturally an implementation would take the 
first hit and return that object. Why it isn't implemented for 
arrays, I suspect it has to do more that the search can be 
ambiguous as the array allows duplicates. Still I think a linear 
search in an array should be implemented based on a first hit 
search.

Oct 25 2019

D Programming

C/C++ Programming

Other

digitalmars.D.learn - About the in expression, Why can't use with array.