digitalmars.D.learn - Combining "chunkBy" and "until" algorithms

Jacob Carlborg <doob me.com> writes:
I have a file with a bunch of lines I want to process. I want to process 
these lines line by line. Most of these lines have the same pattern. 
Some of the lines have a different pattern. I want to bundle those 
lines, which have a non-standard pattern, together with the last line 
that had the standard pattern. The number of lines with a non-standard 
pattern is unknown. Are there some algorithms in Phobos that can help 
with this?

Maybe an algorithm combining "chunkBy" and "until" could do it?

Currently I'm using a standard for loop iterating over the lines. I'm 
always looking at the current line and the next line. When the current 
line is the standard pattern and the next line is not, I do a 
separate loop until I see a standard pattern again, collecting the lines 
with the non-standard pattern in an array.
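
For reference, the loop described above might look roughly like the sketch below. The names `isStandard`, `Bundle` and `bundle` are made up for illustration, and treating every line that starts with "standard" as the standard pattern is an assumption.

```d
import std.algorithm.searching : startsWith;
import std.stdio : writeln;

// Assumption: a line is "standard" when it starts with this prefix.
bool isStandard(string line)
{
    return line.startsWith("standard");
}

// Hypothetical result type: one standard line plus the run that follows it.
struct Bundle
{
    string head;     // last standard line before the non-standard run
    string[] extras; // the non-standard lines that follow it
}

Bundle[] bundle(string[] lines)
{
    Bundle[] result;
    for (size_t i = 0; i + 1 < lines.length; ++i)
    {
        // Look at the current line and the next one.
        if (isStandard(lines[i]) && !isStandard(lines[i + 1]))
        {
            auto b = Bundle(lines[i]);
            size_t j = i + 1;
            // Inner loop: collect until a standard line shows up again.
            while (j < lines.length && !isStandard(lines[j]))
                b.extras ~= lines[j++];
            result ~= b;
            i = j - 1; // resume at the next standard line
        }
    }
    return result;
}

void main()
{
    auto lines = ["standard1", "standard2", "non-standard1", "standard3",
                  "non-standard2", "non-standard3", "standard4"];
    foreach (b; bundle(lines))
        writeln(b.head, ": ", b.extras);
}
```

With this input, "standard2" is bundled with one line and "standard3" with two, while "standard1" and "standard4" are skipped because no non-standard line follows them.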

-- 
/Jacob Carlborg
Nov 04 2016
Edwin van Leeuwen <edder tkwsping.nl> writes:
On Friday, 4 November 2016 at 08:04:12 UTC, Jacob Carlborg wrote:
 Currently I'm using a standard for loop iterating over the 
 lines. I'm always looking at the current line and the next 
 line. When the current line is the standard pattern and the 
 next line is not, I do a separate loop until I see a 
 standard pattern again, collecting the lines with the 
 non-standard pattern in an array.
Could you filter [1] for the non-standard pattern? Filter is lazy, so it
will only start looking for the next match once the current one has been
"handled".

[1] https://dlang.org/phobos/std_algorithm_iteration.html#.filter
Nov 04 2016
Jacob Carlborg <doob me.com> writes:
On 2016-11-04 16:23, Edwin van Leeuwen wrote:

 Could you filter [1] for the non standard pattern? Filter is lazy, so
 will only start looking for the next when the current one has been
 "handled".
Hmm, no, I don't think so. Do you have an example of how this would work?

-- 
/Jacob Carlborg
Nov 05 2016
Timon Gehr <timon.gehr gmx.ch> writes:
On 04.11.2016 09:04, Jacob Carlborg wrote:
 I have a file with a bunch of lines I want to process. I want to process
 these lines line by line. Most of these lines have the same pattern.
 Some of the lines have a different pattern. I want to bundle those
 lines, which have a non-standard pattern, together with the last line
 that had the standard pattern. The number of lines with a non-standard
 pattern is unknown. Are there some algorithms in Phobos that can help
 with this?

 Maybe an algorithm combining "chunkBy" and "until" could do it?

 Currently I'm using a standard for loop iterating over the lines. I'm
 always looking at the current line and the next line. When the current
 line is the standard pattern and the next line is not, I do a
 separate loop until I see a standard pattern again, collecting the lines
 with the non-standard pattern in an array.
"chunkBy" a predicate that checks whether a line is standard. Use 'zip' to focus two adjacent chunks at the same time. Use 'filter' to only consider adjacent chunks where the first chunk consists of standard lines. Then extract the last line of the first chunk and combine it with the second chunk. import std.algorithm, std.range, std.typecons; import std.stdio; void main(){ auto data=["standard1","standard2","non-standard1","standard3", "non-standard2","non-standard3","standard4"]; static bool isStandard(string s){ return s.startsWith("standard"); } auto chunks=data.chunkBy!isStandard; auto pairs=zip(chunks.save,chunks.dropOne); auto result=pairs.filter!(x=>x[0][0]) .map!(x=>tuple(last(x[0][1]),x[1][1])); result.each!(x=>writeln(x[0],", (",x[1].joiner(", "),")")); } auto last(R)(R r){ // missing from Phobos AFAIK return zip(r.save,r.dropOne.recurrence!"a[n-1].dropOne" .until!(x=>x.empty)) .filter!(x=>x[1].empty).front[0]; } Prints: standard2, (non-standard1) standard3, (non-standard2, non-standard3)
Nov 05 2016
Jacob Carlborg <doob me.com> writes:
On 2016-11-05 14:57, Timon Gehr wrote:

 "chunkBy" a predicate that checks whether a line is standard. Use 'zip'
 to focus two adjacent chunks at the same time. Use 'filter' to only
 consider adjacent chunks where the first chunk consists of standard
 lines. Then extract the last line of the first chunk and combine it with
 the second chunk.

 import std.algorithm, std.range, std.typecons;
 import std.stdio;

 void main(){
     auto data=["standard1","standard2","non-standard1","standard3",
                "non-standard2","non-standard3","standard4"];
     static bool isStandard(string s){
         return s.startsWith("standard");
     }
     auto chunks=data.chunkBy!isStandard;
     auto pairs=zip(chunks.save,chunks.dropOne);
     auto result=pairs.filter!(x=>x[0][0])
         .map!(x=>tuple(last(x[0][1]),x[1][1]));
     result.each!(x=>writeln(x[0],", (",x[1].joiner(", "),")"));
 }

 auto last(R)(R r){ // missing from Phobos AFAIK
     return zip(r.save,r.dropOne.recurrence!"a[n-1].dropOne"
                .until!(x=>x.empty))
         .filter!(x=>x[1].empty).front[0];
 }

 Prints:
 standard2, (non-standard1)
 standard3, (non-standard2, non-standard3)
Wow, thanks. I have to take a closer look at this to understand the code
above. What if I want to include all elements, i.e. "standard1" and
"standard4" in the above example?

-- 
/Jacob Carlborg
Nov 05 2016
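
On the follow-up question: one way to keep every line, including "standard1" and "standard4", is to let each standard line open a new group and attach the non-standard lines that follow to it. The sketch below is a plain loop rather than a range pipeline and is not taken from the thread. The names `isStandard` and `groupWithFollowers` are hypothetical, the "starts with standard" test is an assumption, and non-standard lines that appear before any standard line are put in a group of their own.

```d
import std.algorithm.searching : startsWith;
import std.stdio : writeln;

// Assumption: a line is "standard" when it starts with this prefix.
bool isStandard(string line)
{
    return line.startsWith("standard");
}

// Hypothetical helper: every standard line starts a group, and the
// non-standard lines after it are appended to that group.
string[][] groupWithFollowers(string[] lines)
{
    string[][] groups;
    foreach (line; lines)
    {
        if (isStandard(line) || groups.length == 0)
            groups ~= [line];      // open a new group
        else
            groups[$ - 1] ~= line; // attach to the most recent group
    }
    return groups;
}

void main()
{
    auto lines = ["standard1", "standard2", "non-standard1", "standard3",
                  "non-standard2", "non-standard3", "standard4"];
    foreach (g; groupWithFollowers(lines))
        writeln(g[0], ", ", g[1 .. $]);
}
```

Here "standard1" and "standard4" each end up in a group with no followers, while "standard2" and "standard3" keep the lines bundled with them in the earlier output.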