
digitalmars.D.learn - problem with multiwayMerge and chunkBy

Matthew Gamble <gamblemj gmail.com> writes:
Dear most helpful and appreciated D community,

I'm a non-pro academic biologist trying to code a modeler of 
transcription in D. I've run into a small roadblock. Any help 
would be greatly appreciated. I'm hoping someone can tell me why 
I get the following run-time error from this code. I've reduced 
it to something simple:

        import std.algorithm;
        import std.range;
        import std.stdio; // needed for writeln

        auto d = [2, 4, 6, 8];
        auto e = [1, 2, 3, 5, 7];
        auto f = [d, e];

        // error happens
        writeln(f.multiwayMerge.chunkBy!"a == b");
        // no error, but there must be a better way!
        writeln(f.multiwayMerge.array.chunkBy!"a == b");

My understanding is that chunkBy should be able to take an input 
range. Is that not true? I'm trying to get a merged sorted view 
of two sorted ranges followed by merging records based on a 
predicate without allocating memory or swapping the underlying 
values. Speed will be very important at the end of the day and 
sticking the ".array" in the middle kills me, given the size of 
the actual ranges.

Thank you so much for your help. The full tale-of-the-tape is 
below.

Thanks,
Matt

error:
[[1], [2, 2], [3], [4], [5], [6], [7], [8
core.exception.AssertError@C:\D\dmd2\windows\bin\..\..\src\phobos\std\range\primitives.d(2055): Attempting to popFront() past the end of an array of int
----------------
0x00007FF7A30775D3 in d_assert_msg
0x00007FF7A303E497 in std.range.primitives.popFront!int.popFront 
at 
C:\D\dmd2\windows\bin\..\..\src\phobos\std\range\primitives.d(2056)
0x00007FF7A304450C in std.algorithm.setops.MultiwayMerge!("a < 
b", int[][]).MultiwayMerge.popFront at 
C:\D\dmd2\windows\bin\..\..\src\phobos\std\algorithm\setops.d(877)
0x00007FF7A304A700 in 
std.algorithm.iteration.ChunkByChunkImpl!(binaryFun, 
std.algorithm.setops.MultiwayMerge!("a < b", 
int[][])).ChunkByChunkImpl.popFront at 
C:\D\dmd2\windows\bin\..\..\src\phobos\std\algorithm\iteration.d(1624)
0x00007FF7A3054877 in 
std.format.formatRange!(std.stdio.File.LockingTextWriter, 
std.algorithm.iteration.ChunkByChunkImpl!(binaryFun, 
MultiwayMerge!("a < b", int[][])), char).formatRange at 
C:\D\dmd2\windows\bin\..\..\src\phobos\std\format.d(2960)
0x00007FF7A3054796 in 
std.format.formatValue!(std.stdio.File.LockingTextWriter, 
std.algorithm.iteration.ChunkByChunkImpl!(binaryFun, 
MultiwayMerge!("a < b", int[][])), char).formatValue at 
C:\D\dmd2\windows\bin\..\..\src\phobos\std\format.d(3676)
0x00007FF7A3054704 in 
std.format.formatElement!(std.stdio.File.LockingTextWriter, 
std.algorithm.iteration.ChunkByChunkImpl!(binaryFun, 
MultiwayMerge!("a < b", int[][])), char).formatElement at 
C:\D\dmd2\windows\bin\..\..\src\phobos\std\format.d(3180)
0x00007FF7A305410E in 
std.format.formatRange!(std.stdio.File.LockingTextWriter, 
std.algorithm.iteration.ChunkByImpl!("a == b", MultiwayMerge!("a 
< b", int[][])), char).formatRange at 
C:\D\dmd2\windows\bin\..\..\src\phobos\std\format.d(2964)
0x00007FF7A3053F66 in 
std.format.formatValue!(std.stdio.File.LockingTextWriter, 
std.algorithm.iteration.ChunkByImpl!("a == b", MultiwayMerge!("a 
< b", int[][])), char).formatValue at 
C:\D\dmd2\windows\bin\..\..\src\phobos\std\format.d(3676)
0x00007FF7A304B053 in 
std.format.formattedWrite!(std.stdio.File.LockingTextWriter, 
char, std.algorithm.iteration.ChunkByImpl!("a == b", 
MultiwayMerge!("a < b", int[][]))).formattedWrite at 
C:\D\dmd2\windows\bin\..\..\src\phobos\std\format.d(568)
0x00007FF7A3069BA2 in 
std.stdio.File.write!(std.algorithm.iteration.ChunkByImpl!("a == 
b", MultiwayMerge!("a < b", int[][])), char).write at 
C:\D\dmd2\windows\bin\..\..\src\phobos\std\stdio.d(1407)
0x00007FF7A30699DB in 
std.stdio.writeln!(std.algorithm.iteration.ChunkByImpl!("a == b", 
MultiwayMerge!("a < b", int[][]))).writeln at 
C:\D\dmd2\windows\bin\..\..\src\phobos\std\stdio.d(3604)
0x00007FF7A303E258 in gappedIntervals.__unittestL70_4 at 
C:\Users\matth\Documents\GambleLabCodeBaseLocal\intervals\gappedIntervals.d(78)
0x00007FF7A3061917 in void gappedIntervals.__modtest()
0x00007FF7A307C08B in int 
core.runtime.runModuleUnitTests().__foreachbody1(object.ModuleInfo*)
0x00007FF7A3082663 in int object.ModuleInfo.opApply(scope int 
delegate(object.ModuleInfo*)).__lambda2(immutable(object.ModuleInfo*))
0x00007FF7A308594F in int rt.minfo.moduleinfos_apply(scope int 
delegate(immutable(object.ModuleInfo*))).__foreachbody2(ref 
rt.sections_win64.SectionGroup)
0x00007FF7A30858BF in int rt.minfo.moduleinfos_apply(scope int 
delegate(immutable(object.ModuleInfo*)))
0x00007FF7A3082637 in int object.ModuleInfo.opApply(scope int 
delegate(object.ModuleInfo*))
0x00007FF7A307C005 in runModuleUnitTests
0x00007FF7A3070B7D in void rt.dmain2._d_run_main(int, char**, 
extern (C) int function(char[][])*).runAll()
0x00007FF7A3070ADF in void rt.dmain2._d_run_main(int, char**, 
extern (C) int function(char[][])*).tryExec(scope void delegate())
0x00007FF7A30708DF in d_run_main
0x00007FF7A306F9F2 in __entrypoint.main
0x00007FF7A30BA0C5 in __scrt_common_main_seh at 
f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl(283)
0x00007FF8876A2774 in BaseThreadInitThunk
0x00007FF887DA0D51 in RtlUserThreadStart
Nov 04
Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Saturday, 4 November 2017 at 18:57:17 UTC, Matthew Gamble 
wrote:
 [...]

 My understanding is that chunkBy should be able to take an 
 input range. Is that not true? I'm trying to get a merged 
 sorted view of two sorted ranges followed by merging records 
 based on a predicate without allocating memory or swapping the 
 underlying values. Speed will be very important at the end of 
 the day and sticking the ".array" in the middle kills me, given 
 the size of the actual ranges.
It should; this looks like a bug somewhere. Please file one at issues.dlang.org. In the meantime:

        import std.typecons : Tuple;

        // Expands a (value, count) tuple back into `count` copies of `value`.
        struct Replicate(T)
        {
            Tuple!(T, uint) e;
            @property bool empty() { return e[1] == 0; }
            @property auto front() { return e[0]; }
            void popFront() { --e[1]; }
        }

        Replicate!T replicate(T)(Tuple!(T, uint) e)
        {
            return typeof(return)(e);
        }

        f.multiwayMerge.group!"a == b".map!(replicate).writeln;

This does the same thing, provided your predicate is "a == b".
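
To illustrate why `group` can stand in for `chunkBy` when the predicate is plain equality, here is a minimal sketch. It works on an already merged array so that only the `group` step is shown, and it uses Phobos `repeat` where the reply above uses a hand-written struct. The name `chunksViaGroup` is made up for this example, and the `.array` calls are there only to make the result easy to compare.

```d
import std.algorithm : group, map;
import std.array : array;
import std.range : repeat;

// Hypothetical helper for illustration only (not part of Phobos).
// `group` lazily yields (value, count) tuples for runs of equal
// neighbours; `repeat(value, count)` expands each tuple into a run again.
int[][] chunksViaGroup(int[] sorted)
{
    return sorted
        .group
        .map!(t => repeat(t[0], t[1]).array) // .array only for the demo
        .array;
}

void main()
{
    // The merged sequence from the question: 1, 2, 2, 3, 4, 5, 6, 7, 8
    auto merged = [1, 2, 2, 3, 4, 5, 6, 7, 8];
    assert(chunksViaGroup(merged) ==
        [[1], [2, 2], [3], [4], [5], [6], [7], [8]]);
}
```

Because each run is rebuilt from a single value and a count, this only reproduces the original elements when "equal under the predicate" means "identical", which is the restriction mentioned above.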
Nov 04
Matthew Gamble <gamblemj gmail.com> writes:
On Sunday, 5 November 2017 at 03:21:06 UTC, Nicholas Wilson wrote:
 [...]
Thanks Nicholas. I posted the bug as you suggested. My predicate is not quite "a == b"; otherwise I would never have needed chunkBy in the first place. But thanks, I'm pursuing a workaround.

Matt
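
One possible shape for such a workaround, as a minimal sketch and not the poster's actual code: walk the merged range once with nothing but `empty`, `front` and `popFront` (the same primitives the working `.array` version relies on) and report where each chunk starts. The names `walkChunks` and `collectChunks` are hypothetical. The predicate here is applied to each pair of neighbouring elements, which is an assumption of this sketch, and `collectChunks` allocates only so that the result can be checked.

```d
import std.algorithm : multiwayMerge;
import std.functional : binaryFun;
import std.range.primitives : empty, front, popFront, isInputRange;

// Hypothetical helper (not part of Phobos): visits every element of `r`
// once and calls sink(element, startsNewChunk). It does not allocate.
void walkChunks(alias pred, R, Sink)(R r, Sink sink)
if (isInputRange!R)
{
    alias same = binaryFun!pred;
    if (r.empty)
        return;
    auto prev = r.front;
    sink(prev, true);            // the first element always opens a chunk
    r.popFront();
    for (; !r.empty; r.popFront())
    {
        auto cur = r.front;
        sink(cur, !same(prev, cur)); // compare with the previous element
        prev = cur;
    }
}

// Demo only: gathers the chunks into arrays so they can be inspected.
int[][] collectChunks(alias pred)(int[][] sources)
{
    int[][] result;
    sources.multiwayMerge.walkChunks!pred((int x, bool startsNew) {
        if (startsNew)
            result ~= [x];
        else
            result[$ - 1] ~= x;
    });
    return result;
}

void main()
{
    auto d = [2, 4, 6, 8];
    auto e = [1, 2, 3, 5, 7];
    assert(collectChunks!"a == b"([d, e]) ==
        [[1], [2, 2], [3], [4], [5], [6], [7], [8]]);
    // A predicate that is not plain equality: neighbours at most 1 apart.
    assert(collectChunks!"b - a <= 1"([[1, 2], [2, 5]]) == [[1, 2, 2], [5]]);
}
```

In real use the sink would fold each chunk into whatever record is being built instead of appending to arrays, so the walk itself stays allocation-free.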
Nov 05
Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Sunday, 5 November 2017 at 13:32:57 UTC, Matthew Gamble wrote:
 [...]
One thing you might try, instead of using .array to eagerly evaluate the whole range, is to eagerly evaluate only a part at a time (say 128 elements) and .joiner the parts:

        import std.range : chunks;

        f.multiwayMerge.chunks(128).joiner.chunkBy!(pred).writeln;

It seems to be the iteration that stuffs things up, and this changes it.
Nov 05
Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Sunday, 5 November 2017 at 22:47:10 UTC, Nicholas Wilson wrote:
 f.multiwayMerge.chunks(128).joiner.chunkBy!(pred).writeln;

 since it seems to be the iteration that stuffs things up and 
 this changes it.
If that doesn't work you could try rolling your own version of chunk with `take` and a static array.
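
A rough sketch of what rolling your own could look like, under some assumptions: the helper names `fillFrom` and `readInBlocks` are invented for this example, a plain loop is used in place of `take` so that the source range is visibly advanced, and the block size of 4 is chosen only to keep the demo small. The `.dup` exists solely so the blocks can be inspected afterwards.

```d
import std.algorithm : multiwayMerge;
import std.range.primitives : empty, front, popFront;

// Hypothetical helper (not part of Phobos): copies up to buffer.length
// elements out of `source`, advancing it, and returns the filled slice.
T[] fillFrom(R, T)(ref R source, T[] buffer)
{
    size_t n = 0;
    while (n < buffer.length && !source.empty)
    {
        buffer[n++] = source.front;
        source.popFront();
    }
    return buffer[0 .. n];
}

// Demo only: reads the merged range block by block through a static array.
int[][] readInBlocks(size_t blockSize)(int[][] sources)
{
    auto merged = sources.multiwayMerge;
    int[blockSize] buffer;  // fixed-size storage, reused for every block
    int[][] blocks;
    while (!merged.empty)
    {
        auto block = merged.fillFrom(buffer[]);
        blocks ~= block.dup; // copy only so the demo can keep the block
    }
    return blocks;
}

void main()
{
    assert(readInBlocks!4([[2, 4, 6, 8], [1, 2, 3, 5, 7]]) ==
        [[1, 2, 2, 3], [4, 5, 6, 7], [8]]);
}
```

Each block is an ordinary slice, so it could be handed to `chunkBy` directly. Note that a group straddling a block boundary would be split in two unless the last group of one block is carried over into the next.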
Nov 05