## digitalmars.D - Function Proposal: std.algorithm.iteration : cumulativeSum

• e-y-e (102/102) Nov 01 2016 ===
• e-y-e (3/21) Nov 01 2016 Apologies to those reading on mobile or in a non-monospace font
• Ivan Kazmenko (4/8) Nov 01 2016 DMD 2.072 just got cumulativeFold:
• e-y-e (2/10) Nov 01 2016 damn, that was a typo [cumulativeFold -> cumulativeSum]
• e-y-e (3/7) Nov 01 2016 similarly, in the first para, cumulativeSum!((a, b) => a + b)(r,
• Ivan Kazmenko (8/16) Nov 01 2016 Ouch, and I completely misread the TL;DR paragraph, sorry for the
• e-y-e (8/16) Nov 01 2016 To be entirely honest, I haven't experienced any _problems_ with
• jmh530 (5/8) Nov 01 2016 I was happy to see that.
• Meta (3/15) Nov 02 2016 I think there's an obvious reason as to why they're not called
• jmh530 (12/14) Nov 02 2016 I almost always see languages call it cumsum rather than
• e-y-e (2/3) Nov 02 2016 bump : any more opinions?
e-y-e <yurtqbqn grr.la> writes:
```===

TL;DR:
The proposed function takes an input range and an optional seed
and provides an input range containing the intermediate results
of the summation of the given range. As of 2.072 this behaviour
can be emulated by cumulativeSum!((a, b) => a + b)(r, s), but
cumulativeSum involves a specialization for accurate summation of
floating point values and redirects to cumulativeFold for
non-floating point values. The proposed function would be useful
for those doing basic statistical analysis on data sets, but
would also have applications in other fields. Please let me know
in this thread whether you would have a use for this function (or
if you think it should/shouldn't be in phobos)

===

I'd like to propose the function cumulativeFold as a new addition
to std.algorithm.iteration. I have already opened a pull request
[1] for this addition so the full implementation is available
there. The function signatures are:

auto cumulativeSum(Range)(Range r)
if (isInputRange!Range && __traits(compiles, r.front +
r.front))

auto cumulativeSum(Range, Seed)(Range r, Seed s)
if (isInputRange!Range && __traits(compiles, s = s + r.front))

The function returns an input range that contains the
intermediate results of a summation on all of the elements of a
given range. For more details see Prefix Sum [2].

My motivation for adding this function to phobos was originally
that I came across a need for it. I was looking into genetic
algorithms and I wanted to implement some variation of the
Stochastic Universal Sampling [3] selection method, which
requires the fitness values of the population to be sorted and
then selected based on the cumulative sum up to that point.
Phobos was the first place I looked to for an implementation of
cumulativeSum, as I knew that it had an implementation of sum,
and cumulativeSum is a great use of D's ranges that I assumed
would be in phobos. I couldn't find it, but found cumulativeFold
instead. However, from reading the docs on sum, I knew that sum
was a specialization of fold for accurate floating point
summation, and given that I would be summing floating point
values it would be better if the algorithm I used also involved
this type of specialization.

With the knowledge that cumulativeFold is now in phobos, I
realized that this presented quite an obvious gap in this area of
summation and reduction. This gap is best illustrated by this
table:

| Provides no intermediate results | Provides
intermediate results

-------------|----------------------------------|-------------------------------
Not          |                                  |
specialized  |               fold               |
cumulativeFold
for accurate |                                  |
summation    |                                  |

-------------|----------------------------------|-------------------------------
Specialized  |                                  |
for accurate |                sum               |
X
summation    |                                  |

So now what would people other than me use this function for, or
in other words, why should it be in phobos? Firstly, from a
purely logical point of view, cumulative sums can be a useful way
of analysing data. Once the data can be converted into a
cumulative sum, then it is trivial to know what the current
running total is when consuming the data. IE rather than keeping
state like so [trivial example]:

void displayDataset(Data)(Data d, double total)
{
double sum = 0.0;

while (true)
{
if (d.empty) break;
sum += d.front;
if (sum > total) break;
writeln("Data point: ", d.front, ", sum: ", sum);
d.popFront;
}
}

Range based code can now be written with ease to perform the same
job:

void displayDataset(Data)(Data d, double total)
{
d
.zip(d.cumulativeSum)
.until!(t => t[1] > total)
.each!(t => writeln("Data point: ", t[0], ", sum:",
t[1]));
}

Some other useful applications are:
- Graphing an integral of a range of y values without
calculating the actual integral.
- Computing a series summation with increasing accuracy (a la
Fourier).
- Keeping a running mean of incoming data.

Thanks for reading if you got this far, let me know whether you
love/hate the idea.
P.S: its late here so I may only be able to read/respond tomorrow.

===

[1] https://github.com/dlang/phobos/pull/4881
[2] https://en.wikipedia.org/wiki/Prefix_sum
[3] https://en.wikipedia.org/wiki/Stochastic_universal_sampling
```
Nov 01 2016
e-y-e <yurtqbqn grr.la> writes:
```On Tuesday, 1 November 2016 at 21:52:40 UTC, e-y-e wrote:
...
this table:

| Provides no intermediate results | Provides
intermediate results

-------------|----------------------------------|-------------------------------
Not          |                                  |
specialized  |               fold               |
cumulativeFold
for accurate |                                  |
summation    |                                  |

-------------|----------------------------------|-------------------------------
Specialized  |                                  |
for accurate |                sum               |
X
summation    |                                  |

...

Apologies to those reading on mobile or in a non-monospace font
for this table.
```
Nov 01 2016
Ivan Kazmenko <gassa mail.ru> writes:
```On Tuesday, 1 November 2016 at 21:52:40 UTC, e-y-e wrote:
I'd like to propose the function cumulativeFold as a new
pull request [1] for this addition so the full implementation
is available there. The function signatures are:

DMD 2.072 just got cumulativeFold:

https://dlang.org/changelog/2.072.0.html#std-algorithm-iteration-cumulativeFold

https://dlang.org/phobos/std_algorithm_iteration.html#cumulativeFold
```
Nov 01 2016
e-y-e <yurtqbqn grr.la> writes:
```On Tuesday, 1 November 2016 at 22:06:36 UTC, Ivan Kazmenko wrote:
On Tuesday, 1 November 2016 at 21:52:40 UTC, e-y-e wrote:
I'd like to propose the function cumulativeFold as a new
pull request [1] for this addition so the full implementation
is available there. The function signatures are:

DMD 2.072 just got cumulativeFold:

https://dlang.org/changelog/2.072.0.html#std-algorithm-iteration-cumulativeFold

https://dlang.org/phobos/std_algorithm_iteration.html#cumulativeFold

damn, that was a typo [cumulativeFold -> cumulativeSum]
```
Nov 01 2016
e-y-e <yurtqbqn grr.la> writes:
```On Tuesday, 1 November 2016 at 22:09:50 UTC, e-y-e wrote:
On Tuesday, 1 November 2016 at 22:06:36 UTC, Ivan Kazmenko
wrote:
...

damn, that was a typo [cumulativeFold -> cumulativeSum]

similarly, in the first para, cumulativeSum!((a, b) => a + b)(r,
s) -> cumulativeFold!((a, b) => a + b)(r, s)
```
Nov 01 2016
Ivan Kazmenko <gassa mail.ru> writes:
```On Tuesday, 1 November 2016 at 22:13:29 UTC, e-y-e wrote:
On Tuesday, 1 November 2016 at 22:09:50 UTC, e-y-e wrote:
On Tuesday, 1 November 2016 at 22:06:36 UTC, Ivan Kazmenko
wrote:
...

damn, that was a typo [cumulativeFold -> cumulativeSum]

similarly, in the first para, cumulativeSum!((a, b) => a +
b)(r, s) -> cumulativeFold!((a, b) => a + b)(r, s)

Ouch, and I completely misread the TL;DR paragraph, sorry for the
noise!

It's logical to have cumulativeSum specialized from
cumulativeFold when sum is a specialization of fold.

I have yet to write a program which relies on Kahan summation,
but I can imagine that being useful.

Ivan Kazmenko.
```
Nov 01 2016
e-y-e <yurtqbqn grr.la> writes:
```On Tuesday, 1 November 2016 at 22:22:17 UTC, Ivan Kazmenko wrote:
On Tuesday, 1 November 2016 at 22:13:29 UTC, e-y-e wrote:
...

Ouch, and I completely misread the TL;DR paragraph, sorry for
the noise!

My fault, note to self: always re-read everything.

It's logical to have cumulativeSum specialized from
cumulativeFold when sum is a specialization of fold.

I have yet to write a program which relies on Kahan summation,
but I can imagine that being useful.

To be entirely honest, I haven't experienced any _problems_ with
the accuracy of naive summation. But Kahan summation does improve
accuracy in some cases and I want that assurance.

It does have its own inaccuracies though and I think (may be
wrong, again, its late) a flag like -ffastmath would probably
eliminate any accuracy gain.
```
Nov 01 2016
jmh530 <john.michael.hall gmail.com> writes:
```On Tuesday, 1 November 2016 at 22:06:36 UTC, Ivan Kazmenko wrote:
DMD 2.072 just got cumulativeFold:

https://dlang.org/changelog/2.072.0.html#std-algorithm-iteration-cumulativeFold

https://dlang.org/phobos/std_algorithm_iteration.html#cumulativeFold

I was happy to see that.

I'm surprised there wasn't more angst about the name.
cumulativeSum should be cumsum, which implies that cumulativeFold
should be cumfold, IMO.
```
Nov 01 2016
Meta <jared771 gmail.com> writes:
```On Wednesday, 2 November 2016 at 00:41:38 UTC, jmh530 wrote:
On Tuesday, 1 November 2016 at 22:06:36 UTC, Ivan Kazmenko
wrote:
DMD 2.072 just got cumulativeFold:

https://dlang.org/changelog/2.072.0.html#std-algorithm-iteration-cumulativeFold

https://dlang.org/phobos/std_algorithm_iteration.html#cumulativeFold

I was happy to see that.

I'm surprised there wasn't more angst about the name.
cumulativeSum should be cumsum, which implies that
cumulativeFold should be cumfold, IMO.

I think there's an obvious reason as to why they're not called
that.
```
Nov 02 2016
jmh530 <john.michael.hall gmail.com> writes:
```On Wednesday, 2 November 2016 at 20:03:59 UTC, Meta wrote:
I think there's an obvious reason as to why they're not called
that.

I almost always see languages call it cumsum rather than
cumulativeSum.

R -
https://stat.ethz.ch/R-manual/R-devel/library/base/html/cumsum.html
Matlab - https://www.mathworks.com/help/matlab/ref/cumsum.html
Python -
https://docs.scipy.org/doc/numpy/reference/generated/numpy.cumsum.html

The only exceptions I could find were languages that called it
something else, like Accumulate in Mathematica or the scan
functions noted in the cumulativeFold function doc.

In practice, I would probably just create an alias.
```
Nov 02 2016
e-y-e <yurtqbqn grr.la> writes:
```On Tuesday, 1 November 2016 at 21:52:40 UTC, e-y-e wrote:
...

bump : any more opinions?
```
Nov 02 2016