digitalmars.D.learn - Why function template 'among' is of complexity O(1) and not O(n)?
- Gopan (11/25) Feb 19 2014 Recently, I came across the following thread
- Tobias Pankrath (5/33) Feb 19 2014 It's O(k) with k = vals.length, which is the number of arguments
- Gopan (31/68) Feb 19 2014 I agree that vals.length is known at compile time. But while
- Tobias Pankrath (2/6) Feb 19 2014 O(1) = O(k) for any constant k.
- Dicebot (5/14) Feb 19 2014 I don't think it is legit to speak about k as constant here. It
- Tobias Pankrath (5/20) Feb 19 2014 Yep, the meta-algorithm is O(n) but the specific instance used in
- =?UTF-8?B?QWxpIMOHZWhyZWxp?= (14/77) Feb 19 2014 You get that output only because you have inserted the writeln()
- Dicebot (3/4) Feb 19 2014 This is much better explanation - O(n) compile-time algorithm
- anonymous (6/32) Feb 19 2014 [...]
- Jakob Ovrum (5/8) Feb 19 2014 While this is correct, note that there is also an overload that
- Gopan (23/65) Feb 20 2014 Great !! Thank you Ali. I am convinced.
- anonymous (7/21) Feb 20 2014 When v and vals are known at compile time, that opens up the
- Jakob Ovrum (5/7) Feb 20 2014 That's not really true. The optimizer will never try CTFE because
- Peter Alexander (11/22) Feb 19 2014 Your understanding is probably fine. It's just one of those
- Chris Williams (32/37) Feb 19 2014 That's basically saying that on any linear bouded automaton, all
- anonymous (9/18) Feb 19 2014 That's an optimization the compiler may do, but it's not
- Chris Williams (7/17) Feb 19 2014 I think "expected to do" is probably the better phrasing at the
- anonymous (30/50) Feb 19 2014 The foreach is definitely always evaluated at compile time.
- Chris Williams (2/6) Feb 19 2014 Ah, roger.
- Tobias Pankrath (25/31) Feb 20 2014 That's a useless oversimplification since for all sizes lesser
- Tobias Pankrath (1/5) Feb 20 2014 The application is of course O(n).
Recently, I came across the following thread http://forum.dlang.org/thread/l8nocr$1dv5$1 digitalmars.com?page=2 On Monday, 16 December 2013 at 22:10:52 UTC, Andrei Alexandrescu wrote:On 12/16/13 1:45 PM, Walter Bright wrote:To me, it looks like a Linear Search and hence of complexity O(n). Could somebody please elaborate why it is O(1) and not O(n)? If there is an article/document that I am to read to understand this, please cite it. (I am a beginner level programmer and new to D Language.) Thanks, Gopan.On 12/16/2013 12:38 PM, Andrei Alexandrescu wrote:It's O(1) because the data size is in the source text. There's no variable n.uint among(T, Us...)(T v, Us vals) { foreach (i, U; Us) { if (v == vals[i]) return i + 1; } return 0; }This has O(n) behavior, which might be unexpected for the user.
Feb 19 2014
On Wednesday, 19 February 2014 at 08:58:10 UTC, Gopan wrote:Recently, I came across the following thread http://forum.dlang.org/thread/l8nocr$1dv5$1 digitalmars.com?page=2 On Monday, 16 December 2013 at 22:10:52 UTC, Andrei Alexandrescu wrote:It's O(k) with k = vals.length, which is the number of arguments given to the function, which is a compile time constant and not dependent on any input of the final program. In O(n) the n always relates to the input somehow.On 12/16/13 1:45 PM, Walter Bright wrote:To me, it looks like a Linear Search and hence of complexity O(n). Could somebody please elaborate why it is O(1) and not O(n)? If there is an article/document that I am to read to understand this, please cite it. (I am a beginner level programmer and new to D Language.) Thanks, Gopan.On 12/16/2013 12:38 PM, Andrei Alexandrescu wrote:It's O(1) because the data size is in the source text. There's no variable n.uint among(T, Us...)(T v, Us vals) { foreach (i, U; Us) { if (v == vals[i]) return i + 1; } return 0; }This has O(n) behavior, which might be unexpected for the user.
Feb 19 2014
On Wednesday, 19 February 2014 at 09:09:25 UTC, Tobias Pankrath wrote:On Wednesday, 19 February 2014 at 08:58:10 UTC, Gopan wrote:I agree that vals.length is known at compile time. But while running the application, it iterates through every val in the input list until a match is found. So, how can that be considered O(1)? See my test app and output. uint among(T, Us...)(T v, Us vals) { foreach (i, U; Us) { writeln("checking ", v, " == ", vals[i]); if (v == vals[i]) return i + 1; } return 0; } void main() { uint index = among(3, 1,2,5,3); writeln("Index of 3 in (1,2,5,3) is ", index); } outrput of running the app: checking 3 == 1 checking 3 == 2 checking 3 == 5 checking 3 == 3 Index of 3 in (1,2,5,3) is 4 Or, is my undertanding about Big-O notation of complexity wrong? Thanks, GopanRecently, I came across the following thread http://forum.dlang.org/thread/l8nocr$1dv5$1 digitalmars.com?page=2 On Monday, 16 December 2013 at 22:10:52 UTC, Andrei Alexandrescu wrote:It's O(k) with k = vals.length, which is the number of arguments given to the function, which is a compile time constant and not dependent on any input of the final program. In O(n) the n always relates to the input somehow.On 12/16/13 1:45 PM, Walter Bright wrote:To me, it looks like a Linear Search and hence of complexity O(n). Could somebody please elaborate why it is O(1) and not O(n)? If there is an article/document that I am to read to understand this, please cite it. (I am a beginner level programmer and new to D Language.) Thanks, Gopan.On 12/16/2013 12:38 PM, Andrei Alexandrescu wrote:It's O(1) because the data size is in the source text. There's no variable n.uint among(T, Us...)(T v, Us vals) { foreach (i, U; Us) { if (v == vals[i]) return i + 1; } return 0; }This has O(n) behavior, which might be unexpected for the user.
Feb 19 2014
On Wednesday, 19 February 2014 at 09:46:04 UTC, Gopan wrote:Index of 3 in (1,2,5,3) is 4 Or, is my undertanding about Big-O notation of complexity wrong? Thanks, GopanO(1) = O(k) for any constant k.
Feb 19 2014
On Wednesday, 19 February 2014 at 13:03:38 UTC, Tobias Pankrath wrote:On Wednesday, 19 February 2014 at 09:46:04 UTC, Gopan wrote:I don't think it is legit to speak about k as constant here. It is constant for any specific function instance but not for template meta-algorithm as a whole.Index of 3 in (1,2,5,3) is 4 Or, is my undertanding about Big-O notation of complexity wrong? Thanks, GopanO(1) = O(k) for any constant k.
Feb 19 2014
On Wednesday, 19 February 2014 at 13:11:32 UTC, Dicebot wrote:On Wednesday, 19 February 2014 at 13:03:38 UTC, Tobias Pankrath wrote:Yep, the meta-algorithm is O(n) but the specific instance used in the program is O(k). We need to agree on a) what the variables mean and on b) the artifact whose complexity is of interest.On Wednesday, 19 February 2014 at 09:46:04 UTC, Gopan wrote:I don't think it is legit to speak about k as constant here. It is constant for any specific function instance but not for template meta-algorithm as a whole.Index of 3 in (1,2,5,3) is 4 Or, is my undertanding about Big-O notation of complexity wrong? Thanks, GopanO(1) = O(k) for any constant k.
Feb 19 2014
On 02/19/2014 01:46 AM, Gopan wrote:On Wednesday, 19 February 2014 at 09:09:25 UTC, Tobias Pankrath wrote:You get that output only because you have inserted the writeln() expressions in there. That foreach over variadic template arguments is a "compile-time foreach"[1]. For example, if v were 2 and vals were "1, 2, 3", the original code would be expanded to the following: if (2 == 1) return 1; // i == 0 if (2 == 2) return 2; // i == 1 if (2 == 3) return 3; // i == 3 Since all of those conditions is known to be true or false at compile time, the compiler will reduce the code to the following: return 2; Ali [1] http://ddili.org/ders/d.en/tuples.htmlOn Wednesday, 19 February 2014 at 08:58:10 UTC, Gopan wrote:I agree that vals.length is known at compile time. But while running the application, it iterates through every val in the input list until a match is found. So, how can that be considered O(1)? See my test app and output. uint among(T, Us...)(T v, Us vals) { foreach (i, U; Us) { writeln("checking ", v, " == ", vals[i]); if (v == vals[i]) return i + 1; } return 0; } void main() { uint index = among(3, 1,2,5,3); writeln("Index of 3 in (1,2,5,3) is ", index); } outrput of running the app: checking 3 == 1 checking 3 == 2 checking 3 == 5 checking 3 == 3 Index of 3 in (1,2,5,3) is 4 Or, is my undertanding about Big-O notation of complexity wrong?Recently, I came across the following thread http://forum.dlang.org/thread/l8nocr$1dv5$1 digitalmars.com?page=2 On Monday, 16 December 2013 at 22:10:52 UTC, Andrei Alexandrescu wrote:It's O(k) with k = vals.length, which is the number of arguments given to the function, which is a compile time constant and not dependent on any input of the final program. In O(n) the n always relates to the input somehow.On 12/16/13 1:45 PM, Walter Bright wrote:To me, it looks like a Linear Search and hence of complexity O(n). Could somebody please elaborate why it is O(1) and not O(n)? If there is an article/document that I am to read to understand this, please cite it. (I am a beginner level programmer and new to D Language.) Thanks, Gopan.On 12/16/2013 12:38 PM, Andrei Alexandrescu wrote:It's O(1) because the data size is in the source text. There's no variable n.uint among(T, Us...)(T v, Us vals) { foreach (i, U; Us) { if (v == vals[i]) return i + 1; } return 0; }This has O(n) behavior, which might be unexpected for the user.
Feb 19 2014
On Wednesday, 19 February 2014 at 19:45:39 UTC, Ali Çehreli wrote:...This is much better explanation - O(n) compile-time algorithm which results in O(1) run-time code generated :)
Feb 19 2014
On Wednesday, 19 February 2014 at 19:45:39 UTC, Ali Çehreli wrote:On 02/19/2014 01:46 AM, Gopan wrote:[...][...]uint among(T, Us...)(T v, Us vals) { foreach (i, U; Us) { writeln("checking ", v, " == ", vals[i]); if (v == vals[i]) return i + 1; } return 0; }You get that output only because you have inserted the writeln() expressions in there. That foreach over variadic template arguments is a "compile-time foreach"[1]. For example, if v were 2 and vals were "1, 2, 3", the original code would be expanded to the following: if (2 == 1) return 1; // i == 0 if (2 == 2) return 2; // i == 1 if (2 == 3) return 3; // i == 3 Since all of those conditions is known to be true or false at compile time, the compiler will reduce the code to the following: return 2; Ali [1] http://ddili.org/ders/d.en/tuples.htmlNope. v and vals are runtime parameters, their values are not known at compile time. What's known at compile time is vals.length.
Feb 19 2014
On Wednesday, 19 February 2014 at 21:52:06 UTC, anonymous wrote:Nope. v and vals are runtime parameters, their values are not known at compile time. What's known at compile time is vals.length.While this is correct, note that there is also an overload that takes `values` as compile-time parameters, generating a switch statement, which is especially interesting because the switch statement is a good opportunity for the compiler to optimise.
Feb 19 2014
On Wednesday, 19 February 2014 at 19:45:39 UTC, Ali Çehreli wrote:On 02/19/2014 01:46 AM, Gopan wrote:Great !! Thank you Ali. I am convinced. You have just taught me my first lesson of CTFE. After removing writeln() from the template, the following statements compiled without error, which means 'n' got computed at compile time. const uint n = among(3, 1,2,5,3); //no error int[n] arr; //no error. And, int v = 3; const uint n = among(v, 1,2,5,3); //Error: variable v cannot be read at compile time int[n] arr; Error: Integer constant expression expected instead of among(v, 1, 2, 5, 3) My conclusion is that when both 'v' and 'vals' are known at compile time, the complexity is perfectly O(1). I would even like to say O(0) at runtime :) And, when either 'v' or 'vals' is not known at compile time, the complexity is O(vals.length) because the computation happens at runtime.uint among(T, Us...)(T v, Us vals) { foreach (i, U; Us) { writeln("checking ", v, " == ", vals[i]); if (v == vals[i]) return i + 1; } return 0; } void main() { uint index = among(3, 1,2,5,3); writeln("Index of 3 in (1,2,5,3) is ", index); } outrput of running the app: checking 3 == 1 checking 3 == 2 checking 3 == 5 checking 3 == 3 Index of 3 in (1,2,5,3) is 4 Or, is my undertanding about Big-O notation of complexitywrong? You get that output only because you have inserted the writeln() expressions in there. That foreach over variadic template arguments is a "compile-time foreach"[1]. For example, if v were 2 and vals were "1, 2, 3", the original code would be expanded to the following: if (2 == 1) return 1; // i == 0 if (2 == 2) return 2; // i == 1 if (2 == 3) return 3; // i == 3 Since all of those conditions is known to be true or false at compile time, the compiler will reduce the code to the following: return 2; Ali[1] http://ddili.org/ders/d.en/tuples.htmlI have already started learning your tutorial. It is great !! Thanks all who posted. I learnt from every post. I love this forum.
Feb 20 2014
On Thursday, 20 February 2014 at 09:42:34 UTC, Gopan wrote:const uint n = among(3, 1,2,5,3); //no error int[n] arr; //no error. And, int v = 3; const uint n = among(v, 1,2,5,3); //Error: variable v cannot be read at compile time int[n] arr; Error: Integer constant expression expected instead of among(v, 1, 2, 5, 3) My conclusion is that when both 'v' and 'vals' are known at compile time, the complexity is perfectly O(1). I would even like to say O(0) at runtime :) And, when either 'v' or 'vals' is not known at compile time, the complexity is O(vals.length) because the computation happens at runtime.When v and vals are known at compile time, that opens up the possibility of CTFE. It's necessary but not sufficient. When a value is needed at compile time (e.g. initializer for a static variable, length of a static array), then CTFE is forced. When the value is not needed at compile time, the optimizer may go for it, but that's not guaranteed.
Feb 20 2014
On Thursday, 20 February 2014 at 13:40:38 UTC, anonymous wrote:When the value is not needed at compile time, the optimizer may go for it, but that's not guaranteed.That's not really true. The optimizer will never try CTFE because of the halting problem. Runtime is runtime, compile-time is compile-time - they are clearly separated by the context of the expression alone.
Feb 20 2014
On Thursday, 20 February 2014 at 17:08:11 UTC, Jakob Ovrum wrote:On Thursday, 20 February 2014 at 13:40:38 UTC, anonymous wrote:LDC does it. The halting problem just dictates that it can't be done in all cases. test.d: --- int among(Value, Values...) (Value value, Values values) { foreach (i, v; values) if (value == v) return i + 1; return 0; } int main() { return among(3, 1,2,5,3); } --- `ldmd2 -O -c test.d && objdump -d -M intel test.o` (excerpt): --- 0000000000000000 <_Dmain>: 0: b8 04 00 00 00 mov eax,0x4 5: c3 ret 6: 66 2e 0f 1f 84 00 00 nop WORD PTR cs:[rax+rax*1+0x0] d: 00 00 00 --- I.e. int main() {return 4;}When the value is not needed at compile time, the optimizer may go for it, but that's not guaranteed.That's not really true. The optimizer will never try CTFE because of the halting problem. Runtime is runtime, compile-time is compile-time - they are clearly separated by the context of the expression alone.
Feb 20 2014
On Thursday, 20 February 2014 at 13:40:38 UTC, anonymous wrote:When a value is needed at compile time (e.g. initializer for a static variable, length of a static array), then CTFE is forced. When the value is not needed at compile time, the optimizer may go for it, but that's not guaranteed.On Thursday, 20 February 2014 at 21:23:08 UTC, anonymous wrote:On Thursday, 20 February 2014 at 17:08:11 UTC, Jakob Ovrum wrote:While it's evaluation at compile time, that optimization should not be called "CTFE". The way I had worded it, it sounded like __ctfe may or may not be true, depending on the optimizer's mood. That's not so, of course.On Thursday, 20 February 2014 at 13:40:38 UTC, anonymous wrote:LDC does it. The halting problem just dictates that it can't be done in all cases.When the value is not needed at compile time, the optimizer may go for it, but that's not guaranteed.That's not really true. The optimizer will never try CTFE because of the halting problem. Runtime is runtime, compile-time is compile-time - they are clearly separated by the context of the expression alone.
Feb 21 2014
On Wednesday, 19 February 2014 at 09:46:04 UTC, Gopan wrote:On Wednesday, 19 February 2014 at 09:09:25 UTC, Tobias PankrathYour understanding is probably fine. It's just one of those technicalities which is a bit misleading. The algorithm is O(vals.length), there's no arguing about that, but since vals.length is a constant, it's also O(1) because O(1) = O(2) = O(1,000,000) = O(k) for any constant k. If you want to get even more anal about it, searching an array is technically O(1) because an array cannot be bigger than size_t.max, and since size_t.max is a constant, O(size_t.max) is O(1). Again, completely misleading but technically correct in an esoteric, academic way.It's O(k) with k = vals.length, which is the number of arguments given to the function, which is a compile time constant and not dependent on any input of the final program. In O(n) the n always relates to the input somehow.I agree that vals.length is known at compile time. But while running the application, it iterates through every val in the input list until a match is found. So, how can that be considered O(1)? ... Or, is my undertanding about Big-O notation of complexity wrong?
Feb 19 2014
On Wednesday, 19 February 2014 at 22:05:49 UTC, Peter Alexander wrote:If you want to get even more anal about it, searching an array is technically O(1) because an array cannot be bigger than size_t.max, and since size_t.max is a constant, O(size_t.max) is O(1). Again, completely misleading but technically correct in an esoteric, academic way.That's basically saying that on any linear bouded automaton, all algorithms (except an infinite loop) are O(1) - which is silly. To the OP: The among() function is defined to template out into multiple versions of itself depending on the number of arguments, rather than using variadic parameter checking during runtime. When the template writes out the definition of the function, if the parameters can be interpreted during compile-time, then the foreach ( O(n) ) will be run by the compiler to detect the correct parameter and the function will be written out as just as a return statement in the body, without the loop. So during runtime, the function would have an O(1) runtime. If the parameters can't be determined during compile-time then the function will be written out with the contents of the foreach copied into the body as many times as there are parameters, making the runtime O(n), where n is the number of parameters to the function. Where people are getting a bit technical is that, because the function is generated by a template the following two calls: among("a", "b", "c"); among("a", "b", "c", "d"); end up declaring two separate functions, one which accepts three parameters and the other which accepts four. The version which accepts three parameters will always run the same speed no matter what you call it - since it does the same thing exactly three times. The version which accepts four parameters is the same. Since they don't have any logic - they just run through a constant list of operations with no loop - one can say that they are O(1). Technically, it's true, but also a mildly pointless distinction to make.
Feb 19 2014
On Wednesday, 19 February 2014 at 22:46:45 UTC, Chris Williams wrote:When the template writes out the definition of the function, if the parameters can be interpreted during compile-time, then the foreach ( O(n) ) will be run by the compiler to detect the correct parameter and the function will be written out as just as a return statement in the body, without the loop. So during runtime, the function would have an O(1) runtime.That's an optimization the compiler may do, but it's not guaranteed. And `among` is not special in this regard. [...]The version which accepts three parameters will always run the same speed no matter what you call it - since it does the same thing exactly three times.Well, it returns as soon as the value is found. So among(v, "b", "c") finishes quicker when v is "b" than when it's "c" or "a". It only does the same thing three times when it didn't find the value after two times.
Feb 19 2014
On Thursday, 20 February 2014 at 00:36:38 UTC, anonymous wrote:On Wednesday, 19 February 2014 at 22:46:45 UTC, Chris Williams wrote:I think "expected to do" is probably the better phrasing at the moment. Traits/annotations would be largely unusable unless foreach was able to behave as a static construct. And since there is no way for one to declare "static foreach", the only way foreach can be used for running over traits, is for the compiler to detect constant values in the parameters to the statement.When the template writes out the definition of the function, if the parameters can be interpreted during compile-time, then the foreach ( O(n) ) will be run by the compiler to detect the correct parameter and the function will be written out as just as a return statement in the body, without the loop. So during runtime, the function would have an O(1) runtime.That's an optimization the compiler may do, but it's not guaranteed. And `among` is not special in this regard.
Feb 19 2014
On Thursday, 20 February 2014 at 01:41:25 UTC, Chris Williams wrote:On Thursday, 20 February 2014 at 00:36:38 UTC, anonymous wrote:The foreach is definitely always evaluated at compile time. That's independent of the values being known statically. It generates a couple of if statements. But optimizing those if statements away for static values is not guaranteed or expected. In other words, with code, this: --- uint among(Value, Values...) (Value value, Values values) { foreach (i, v; values) if (value == v) return i + 1; return 0; } auto foo = among("a", "b", "c"); --- definitely boils down to something like this: --- uint among_string_string_string(string value, string v0, string v2) { if(value == v0) return 1; if(value == v1) return 2; return 0; } auto foo = among_string_string_string("a", "b", "c"); --- But it's a mere optimization to get to this: --- auto foo = 0; ---On Wednesday, 19 February 2014 at 22:46:45 UTC, Chris Williams wrote:I think "expected to do" is probably the better phrasing at the moment. Traits/annotations would be largely unusable unless foreach was able to behave as a static construct. And since there is no way for one to declare "static foreach", the only way foreach can be used for running over traits, is for the compiler to detect constant values in the parameters to the statement.When the template writes out the definition of the function, if the parameters can be interpreted during compile-time, then the foreach ( O(n) ) will be run by the compiler to detect the correct parameter and the function will be written out as just as a return statement in the body, without the loop. So during runtime, the function would have an O(1) runtime.That's an optimization the compiler may do, but it's not guaranteed. And `among` is not special in this regard.
Feb 19 2014
On Thursday, 20 February 2014 at 02:10:50 UTC, anonymous wrote:The foreach is definitely always evaluated at compile time. That's independent of the values being known statically. It generates a couple of if statements. But optimizing those if statements away for static values is not guaranteed or expected.Ah, roger.
Feb 19 2014
On Wednesday, 19 February 2014 at 22:05:49 UTC, Peter Alexander wrote:O(1) = O(2) = O(1,000,000) = O(k) for any constant k. If you want to get even more anal about it, searching an array is technically O(1) because an array cannot be bigger than size_t.max, and since size_t.max is a constant, O(size_t.max) is O(1). Again, completely misleading but technically correct in an esoteric, academic way.That's a useless oversimplification since for all sizes lesser than size_t.max there is a better input-length depending bound for the complexity. Just like every algorithm that is in O(n) is also in O(n^3), but usually you are interested • in the smallest upper bound • that depends on the input. The input dependency is important because it lets us reason about how the algorithm might scale. Now, let n be the size of input, for example in this application, n is input.length. -- int[] input = getInputFromSomeWhere(); foreach(i; input) { if(i.among!(1, 42, 12)) { writeln(i); break; } } -- Now, among does not depend of input.length and it will never do in any application. That is the sole reason I say it's O(1).
Feb 20 2014
} -- Now, among does not depend of input.length and it will never do in any application. That is the sole reason I say it's O(1).The application is of course O(n).
Feb 20 2014