digitalmars.D.bugs - [Issue 10670] New: std.algorithm.reduce: no-seed initialization wrong design
- d-bugmail puremagic.com (45/45) Jul 19 2013 http://d.puremagic.com/issues/show_bug.cgi?id=10670
- d-bugmail puremagic.com (13/13) Jul 19 2013 http://d.puremagic.com/issues/show_bug.cgi?id=10670
- d-bugmail puremagic.com (18/18) Jul 19 2013 http://d.puremagic.com/issues/show_bug.cgi?id=10670
- d-bugmail puremagic.com (36/36) Jul 22 2013 http://d.puremagic.com/issues/show_bug.cgi?id=10670
- d-bugmail puremagic.com (15/19) Jul 22 2013 http://d.puremagic.com/issues/show_bug.cgi?id=10670
- d-bugmail puremagic.com (13/27) Jul 22 2013 http://d.puremagic.com/issues/show_bug.cgi?id=10670
- d-bugmail puremagic.com (8/13) Jul 22 2013 http://d.puremagic.com/issues/show_bug.cgi?id=10670
http://d.puremagic.com/issues/show_bug.cgi?id=10670 Summary: std.algorithm.reduce: no-seed initialization wrong design Product: D Version: D2 Platform: All OS/Version: All Status: NEW Severity: normal Priority: P2 Component: Phobos AssignedTo: monarchdodra gmail.com ReportedBy: monarchdodra gmail.com reduce has a "no-seed" implementation provided: "The one-argument version $(D reduce!fun(range)) works similarly, but it uses the first element of the range as the seed (the range must be non-empty)." I question this design as invalid: The first element is never passed through the transform function. There is no reason it be given any "special treatment". Here is a trivial example of why the design is (IMO) wrong: //---- import std.stdio, std.algorithm, std.range; void main(string[] args) { auto arr = [1, 1]; auto sumDoubles1 = reduce!"a + 2*b"(0, arr); auto sumDoubles2 = reduce!"a + 2*b"(arr); writeln(sumDoubles1); //4 writeln(sumDoubles2); //3 (!) } //---- I really can't see any universe where you could justify that return value... The correct seed value should be "Seed seed = fun(Seed.init, r.front)": This means the first element is correctly "processed". Of course, doing this really just boils down to seed being Seed.init. An added "bonus" is that it makes simple "isIterable" types much more efficient, as they don't have to check "isInitialized" on every iteration. -------- Unsure if this is "ER", "wrong-code" or what. Not sure if changing the behavior is acceptable (IMO, it should be). I was re-writing reduce already, so please provide your feedback on the issue so that I can take it into account :) -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 19 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10670 Peter Neubauer <peterneubauer2 gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |peterneubauer2 gmail.com 07:34:43 PDT --- float.init is nan, so this change would require all (float[]).reduce!"a+b" calls to be rewritten with an explicit seed. I imagine that this is a relatively common use case, while I've never had to reduce!"a+2b" anything. Users of the latter case should get the burden of the explicit seed. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 19 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10670 hsteoh quickfur.ath.cx changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |hsteoh quickfur.ath.cx I've run into this issue before. I agree that if no seed value is provided, it should use ElementType!range.init, NOT range.front. In the case of floats, well... I'd argue that using the seedless variety of reduce on a float range is already a bug, so it should be fixed anyway. Alternatively, since float.init is always NaN, it makes no sense to use the seedless variety of reduce, so we could just static assert(false) if isFloatingPointType!(ElementType!range). This will break existing code but at least it will point out why the code is breaking, instead of silently changing whatever the current behaviour is. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 19 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10670 I've done the dev. Here is my preliminary experience with it: For starters, there *is* some convenience in using "front as seed" for calls such as: //---- double[] a = [ 3, 4, 7, 11, 3, 2, 5 ]; // Compute minimum and maximum in one pass auto r = reduce!(min, max)(a); //---- Here, even a straight up "0.0" is a bad seed. The code needs to be fixed as: //---- double[] a = [ 3, 4, 7, 11, 3, 2, 5 ]; // Compute minimum and maximum in one pass auto r = reduce!(min, max)(tuple(double.max, double.min), a); //---- Not very convenient. But still, nothing surmountable. But at the same time, if you look at the existing unittests, you can see some strange things such as: //---- // two funs auto r2 = reduce!("a + b", "a - b")(tuple(0.0, 0.0), a); writeln(r2); assert(r2[0] == 7 && r2[1] == -7); auto r3 = reduce!("a + b", "a - b")(a); assert(r3[0] == 7 && r3[1] == -1); //---- I don't know what that was trying to prove, but it shows that no-seed can do some really strange things. The "assert not float (or char for that matter)" was very efficient in catching float seeds, and catched almost all of the "wrong usage" cases. The only one that it did not catch was the above case. But such code doesn't make sense to me. I think "silent breakage" would really be minimal. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 22 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10670 bearophile_hugs eml.cc changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |bearophile_hugs eml.ccIn the case of floats, well... I'd argue that using the seedless variety of reduce on a float range is already a bug,I don't think it's a bug, this is Python:24.0 I have a ton of D code that relies on such behavour of D reduce. I suggest to just swap the seed and sequence arguments of reduce, to support UFCS chains, and leave the rest of reduce as it is. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------a = [2.0, 3.0, 4.0] reduce(lambda x, y: x * y, a)
Jul 22 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10670You could be right, and I've seen use cases where it makes a lot of sense, eg: reduce!`a ~ " " ~ b`(["hello", "world"]);In the case of floats, well... I'd argue that using the seedless variety of reduce on a float range is already a bug,I don't think it's a bug, this is Python:24.0 I have a ton of D code that relies on such behavour of D reduce.a = [2.0, 3.0, 4.0] reduce(lambda x, y: x * y, a)I suggest to just swap the seed and sequence arguments of reduce, to support UFCS chains, and leave the rest of reduce as it is.I've tried in http://d.puremagic.com/issues/show_bug.cgi?id=8755, but the conclusion is that it is not possible without breaking code. And when I say "breaking code", I mean "still compiles but silently generates different results". That, and no migration plan was found. Short of finding a new name for a new function, I have not been able to achieve this goad. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 22 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10670I've tried in http://d.puremagic.com/issues/show_bug.cgi?id=8755, but the conclusion is that it is not possible without breaking code. And when I say "breaking code", I mean "still compiles but silently generates different results". That, and no migration plan was found. Short of finding a new name for a new function, I have not been able to achieve this goad.I understand. I suggest to introduce a new function named like your "fold" and slowly deprecate "reduce". -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 22 2013