## digitalmars.D.bugs - [Issue 5968] New: Two changes for std.algorithm.group()?

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=5968

--- Comment #0 from bearophile_hugs eml.cc 2011-05-09 03:14:50 PDT ---
Andrej Mitrovic has asked how to split the following array into three
arrays/ranges, according to the splitting predicate x<32:

[64, 64, 64, 32, 31, 16, 32, 33, 64]

A solution using group():

import std.stdio, std.algorithm;

void main() {
    auto arr = [64, 64, 64, 32, 31, 16, 32, 33, 64];

    int last = 0;
    foreach (g; group!q{ (a < 32) == (b < 32) }(arr)) {
        writeln(arr[last .. last+g[1]]);
        last += g[1];
    }
}

Output:
[64, 64, 64, 32]
[31, 16]
[32, 33, 64]
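
For readers more at home in Python, the same count-and-slice idea can be sketched as follows. This is only an illustration, not the Phobos implementation: `run_lengths` and `same_side` are hypothetical names, and the sketch assumes each item is compared with the first item of the current run, which gives the same result as comparing neighbours when the predicate is an equivalence relation like the one above.

```python
def run_lengths(items, same):
    """Return the lengths of runs of adjacent items for which same(head, item) holds."""
    lengths = []
    head = None  # first item of the current run
    for item in items:
        if lengths and same(head, item):
            lengths[-1] += 1      # item extends the current run
        else:
            head = item           # item starts a new run
            lengths.append(1)
    return lengths


arr = [64, 64, 64, 32, 31, 16, 32, 33, 64]
same_side = lambda a, b: (a < 32) == (b < 32)

# Turn the counts back into slices, as the D loop above does with `last`.
last = 0
slices = []
for n in run_lengths(arr, same_side):
    slices.append(arr[last:last + n])
    last += n
print(slices)
```

The caller has to track an offset by hand only because the counts alone carry no reference to the items.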

Andrei has suggested that the second item of the tuples yielded by group() be
a lazy range instead of just a counter. This is an improvement, and it brings
group() closer to Python's itertools.groupby(). With this change the code
becomes simpler (untested code):

import std.stdio, std.algorithm;

void main() {
    auto arr = [64, 64, 64, 32, 31, 16, 32, 33, 64];

    foreach (g; group!q{ (a < 32) == (b < 32) }(arr))
        writeln(g[1]); // g[1] is lazy
}

In Python, groupby() takes a key mapping function, like D's schwartzSort(),
which is handier:

from itertools import groupby
arr = [64, 64, 64, 32, 31, 16, 32, 33, 64]
[list(g) for h,g in groupby(arr, key = lambda x: x < 32)]

[[64, 64, 64, 32], [31, 16], [32, 33, 64]]

I suggest changing D's group() to follow the Python groupby() design here
too. With this change the D code improves further (untested):

import std.stdio, std.algorithm;

void main() {
    auto arr = [64, 64, 64, 32, 31, 16, 32, 33, 64];

    foreach (g; group!q{ a < 32 }(arr))
        writeln(g[1]);
}

Implementation note: unlike schwartzSort(), there is no need to store the
results of all the key function calls. Only the key of the current group has to
be kept, which avoids slow memory allocations.
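
A rough Python sketch of that implementation note, under some assumptions: `group_by_key` is a hypothetical name, and for brevity each group is collected into a list rather than exposed as a lazy range as proposed. The point it illustrates is that the key function runs once per item and only the current group's key is remembered.

```python
def group_by_key(items, key):
    """Yield (key_value, group) pairs for runs of adjacent items sharing a key."""
    it = iter(items)
    try:
        first = next(it)
    except StopIteration:
        return                    # empty input: no groups
    current_key = key(first)      # the only key value kept between items
    group = [first]
    for item in it:
        k = key(item)             # key is computed exactly once per item
        if k == current_key:
            group.append(item)
        else:
            yield current_key, group
            current_key, group = k, [item]
    yield current_key, group      # flush the final run


arr = [64, 64, 64, 32, 31, 16, 32, 33, 64]
result = [g for _, g in group_by_key(arr, lambda x: x < 32)]
print(result)  # [[64, 64, 64, 32], [31, 16], [32, 33, 64]]
```

No array of precomputed keys is allocated, unlike a Schwartzian-transform sort, which needs every key up front.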

With tuple unpacking syntax sugar:

import std.stdio, std.algorithm;

void main() {
    auto arr = [64, 64, 64, 32, 31, 16, 32, 33, 64];

    foreach ((h, g); group!q{ a < 32 }(arr))
        writeln(g);
}

May 09 2011
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=5968

--- Comment #1 from bearophile_hugs eml.cc 2013-02-12 17:42:44 PST ---
I suggest introducing a new function, std.algorithm.groupFull, that yields
tuples whose second field is a lazy range of all the grouped items, as in
Python's groupby.

This Python 2 program prints all the longest words whose characters are in
sorted order:

from itertools import groupby
o = (w for w in map(str.strip, open("words.txt")) if sorted(w)==list(w))
print list(next(groupby(sorted(o, key=len, reverse=True), key=len))[1])

A similar program using groupFull:

import std.stdio, std.algorithm, std.range, std.file, std.string;

void main() {
    "words.txt"
    .readText() // read the file contents, not the file name
    .splitter()
    .filter!isSorted()
    .array()
    .sort!q{a.length > b.length}()
    .groupFull!q{a.length == b.length}()
    .front[1]
    .writeln();
}

Feb 12 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=5968

--- Comment #2 from Peter Alexander <peter.alexander.au gmail.com> 2013-09-22 07:17:04 PDT ---
*** Issue 11097 has been marked as a duplicate of this issue. ***

Sep 22 2013