digitalmars.D - Why does std.string.splitLines return an array?
- Chad J (3/3) Oct 21 2012 std.string.splitLines returns an array, which is pretty grody. Why not
- bearophile (17/20) Oct 21 2012 splitLines is probably modeled on the str.splitlines() string
- Jonathan M Davis (4/7) Oct 21 2012 If you want a lazy range, then use std.algorithm.splitter. std.string
- Chad J (27/34) Oct 21 2012 std.algorithm.splitter is simply not acceptable for this. It doesn't
- Andrei Alexandrescu (6/31) Oct 22 2012 Agreed. We should add splitter() accepting only one argument of some
std.string.splitLines returns an array, which is pretty grody. Why not return a lazily-evaluated range struct so that we can avoid allocations on this simple but common operation?
Oct 21 2012
Chad J:
> std.string.splitLines returns an array, which is pretty grody. Why not return a lazily-evaluated range struct so that we can avoid allocations on this simple but common operation?

splitLines is probably modeled on Python's str.splitlines() string method, which returns a list (array) of strings (because Python was originally eager). Phobos has both split() and splitter(); the former is eager and the latter is lazy. So maybe you want a splitterLines().

I have asked for a lazy splitLines, vote here:
http://d.puremagic.com/issues/show_bug.cgi?id=4764

But I have suggested a different naming:
http://d.puremagic.com/issues/show_bug.cgi?id=5838

See also:
http://d.puremagic.com/issues/show_bug.cgi?id=6730
http://d.puremagic.com/issues/show_bug.cgi?id=7689

And especially:
http://d.puremagic.com/issues/show_bug.cgi?id=8013

Bye,
bearophile
Oct 21 2012
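The eager/lazy distinction bearophile describes can be illustrated with a minimal sketch. It assumes a comma-separated sample string (not from the thread) and uses `std.array.split` for the eager form and `std.algorithm.splitter` for the lazy form.

```d
import std.algorithm : equal, splitter;
import std.array : split;

void main()
{
    // Sample input chosen for illustration only.
    string text = "alpha,beta,gamma";

    // Eager: split() allocates and returns a string[] right away.
    string[] eager = text.split(",");
    assert(eager == ["alpha", "beta", "gamma"]);

    // Lazy: splitter() returns a range that yields slices of the
    // original string on demand, without building an array.
    auto lazyParts = text.splitter(",");
    assert(lazyParts.front == "alpha");
    assert(equal(lazyParts, ["alpha", "beta", "gamma"]));
}
```

The lazy form only does as much work as the consumer asks for, which is the property the original question wants for line splitting.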
On Sun, 2012-10-21 at 18:00 -0400, Chad J wrote:
> std.string.splitLines returns an array, which is pretty grody. Why not return a lazily-evaluated range struct so that we can avoid allocations on this simple but common operation?

If you want a lazy range, then use std.algorithm.splitter. std.string operates on and returns strings, not general ranges.

- Jonathan M Davis
Oct 21 2012
On 10/21/2012 06:35 PM, Jonathan M Davis wrote:
> On Sun, 2012-10-21 at 18:00 -0400, Chad J wrote:
> > std.string.splitLines returns an array, which is pretty grody. Why not return a lazily-evaluated range struct so that we can avoid allocations on this simple but common operation?
>
> If you want a lazy range, then use std.algorithm.splitter. std.string operates on and returns strings, not general ranges.
>
> - Jonathan M Davis

std.algorithm.splitter is simply not acceptable for this. It doesn't have this kind of logic:

    bool matchLineEnd( string text, size_t pos )
    {
        if ( pos+1 < text.length
             && text[pos] == '\r'
             && text[pos+1] == '\n' )
            return true;
        else if ( pos < text.length
                  && (text[pos] == '\r' || text[pos] == '\n') )
            return true;
        else
            return false;
    }

I've never used std.algorithm.splitter for line splitting, despite trying. It's always more effective to write your own.

I'm with bearophile on this one:
http://d.puremagic.com/issues/show_bug.cgi?id=4764

I think his suggestions about naming also just make *sense*. I'm not sure how practical some of those naming changes would be if there is a lot of wild D2 code that uses the current weirdly-named stuff that emphasizes eager evaluation and extraneous allocations.

I'm not sure how necessary it is to even /have/ functions that return arrays when there are lazy versions: the result of a lazy function can always be fed to std.array.array(range). Heh, even parentheses nesting is nicely handled by UFCS now.
Oct 21 2012
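To make Chad J's objection concrete, here is a small sketch showing what goes wrong when a single fixed separator is used on text with mixed line endings. The sample string is an assumption for illustration; `matchLineEnd` is copied from the post with only whitespace changed.

```d
import std.algorithm : equal, splitter;

// Predicate from the post above: true if a line terminator
// ("\r\n", '\r', or '\n') starts at position pos.
bool matchLineEnd(string text, size_t pos)
{
    if (pos + 1 < text.length && text[pos] == '\r' && text[pos + 1] == '\n')
        return true;
    else if (pos < text.length && (text[pos] == '\r' || text[pos] == '\n'))
        return true;
    else
        return false;
}

void main()
{
    // Illustrative input mixing "\r\n" and "\n" terminators.
    string text = "one\r\ntwo\nthree";

    // Splitting on "\n" alone leaves the '\r' attached to the first line.
    assert(equal(text.splitter("\n"), ["one\r", "two", "three"]));

    // The predicate recognizes every terminator position.
    assert(matchLineEnd(text, 3));   // '\r' of "\r\n"
    assert(matchLineEnd(text, 4));   // '\n' of "\r\n"
    assert(matchLineEnd(text, 8));   // lone '\n'
    assert(!matchLineEnd(text, 0));  // 'o'
}
```

Note that the predicate reports whether a terminator starts at a position but not how long it is, so a splitter built on it still has to decide whether to skip one or two characters.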
On 10/22/12 1:05 AM, Chad J wrote:
> std.algorithm.splitter is simply not acceptable for this. It doesn't have this kind of logic:
>
>     bool matchLineEnd( string text, size_t pos )
>     {
>         if ( pos+1 < text.length
>              && text[pos] == '\r'
>              && text[pos+1] == '\n' )
>             return true;
>         else if ( pos < text.length
>                   && (text[pos] == '\r' || text[pos] == '\n') )
>             return true;
>         else
>             return false;
>     }

Agreed. We should add splitter() accepting only one argument of some string type. It would use the line splitting logic above. Could you please adapt your code to do this and package it in a pull request?

Thanks!

Andrei
Oct 22 2012
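One way the single-argument lazy splitter proposed in this thread might look is sketched below. This is an illustration under stated assumptions, not the Phobos implementation: the names `LazyLines` and `lazyLines` are hypothetical, only `string` input is handled, and a trailing terminator is assumed not to produce a final empty line.

```d
import std.array : array;

// Hypothetical lazy input range over the lines of a string.
// Each front is a slice of the original text, so no allocation occurs.
struct LazyLines
{
    private string rest;     // input not yet consumed
    private string current;  // current line, terminator excluded
    private bool exhausted;

    this(string text)
    {
        rest = text;
        advance();
    }

    @property bool empty() { return exhausted; }
    @property string front() { return current; }
    void popFront() { advance(); }

    private void advance()
    {
        if (rest.length == 0)
        {
            exhausted = true;
            return;
        }
        // Scan to the next '\r' or '\n', or to the end of the input.
        size_t i = 0;
        while (i < rest.length && rest[i] != '\r' && rest[i] != '\n')
            ++i;
        current = rest[0 .. i];
        // Skip the terminator: two chars for "\r\n", one for a lone '\r' or '\n'.
        if (i + 1 < rest.length && rest[i] == '\r' && rest[i + 1] == '\n')
            i += 2;
        else if (i < rest.length)
            i += 1;
        rest = rest[i .. $];
    }
}

// Hypothetical helper so the range can be created with a function call.
LazyLines lazyLines(string text)
{
    return LazyLines(text);
}

void main()
{
    string text = "one\r\ntwo\nthree\rfour";

    // Lazy use: only the first line is computed here.
    assert(lazyLines(text).front == "one");

    // Eager copy when an array is really needed, via UFCS.
    string[] all = text.lazyLines.array;
    assert(all == ["one", "two", "three", "four"]);
}
```

As Chad J points out, callers who do want an array can get one by passing the lazy range to `std.array.array`, so the lazy version covers both uses.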