digitalmars.D.learn - [OT] Is this more readable, or just way too verbose?
- simendsjo (72/72) Aug 09 2010 I took splitlines from std.string, which is a simple, short method.
- bearophile (4/6) Aug 09 2010 Write a good enough benchmark and run it.
- Lutger (21/113) Aug 09 2010 The CR and LF constants are a bit too much, probably because they don't ...
I took splitlines from std.string, which is a simple, short method. S[] splitlines(S)(S s) { size_t istart; auto result = Appender!(S[])(); foreach (i; 0 .. s.length) { immutable c = s[i]; if (c == '\r' || c == '\n') { result.put(s[istart .. i]); istart = i + 1; if (c == '\r' && i + 1 < s.length && s[i + 1] == '\n') { i++; istart++; } } } if (istart != s.length) { result.put(s[istart .. $]); } return result.data; } I guess it takes less than 30 seconds to fully understand this method. Then I rap.. I mean refactored it to this: S[] mysplitlines(S)(S s) { const CR = '\r'; const LF = '\n'; size_t istart; auto result = Appender!(S[])(); foreach (i; 0 .. s.length) { immutable c = s[i]; immutable isCR = (c == CR); immutable isLF = (c == LF); if (isCR || isLF) { auto line = s[istart .. i]; result.put(line); istart = i + 1; // Might be CRLF. In that case we need to consume LF too if (isCR) { immutable hasMoreCharacters = (i + 1 < s.length); immutable nextIsLF = hasMoreCharacters && (s[i + 1] == LF); immutable isCRLF = isCR && nextIsLF; if (isCRLF) { i++; istart++; } } } } immutable lineNotEmpty = (istart != s.length); if (lineNotEmpty) { auto lastLine = s[istart .. $]; result.put(lastLine); } return result.data; } Yes, I'm reading some books now and don't have much experience :) It went from a small 26 line, very readable function to 48 lines (85% increase!) with many more temporary variables.. So... Do you think this kind of code is (more) readable, or just way too verbose and doing more harm than good? And will the compiler generate slower code, or should the optimizer be able to inline the temporaries?
Aug 09 2010
simendsjo:And will the compiler generate slower code, or should the optimizer be able to inline the temporaries?Write a good enough benchmark and run it. Bye, bearophile
Aug 09 2010
simendsjo wrote:I took splitlines from std.string, which is a simple, short method. S[] splitlines(S)(S s) { size_t istart; auto result = Appender!(S[])(); foreach (i; 0 .. s.length) { immutable c = s[i]; if (c == '\r' || c == '\n') { result.put(s[istart .. i]); istart = i + 1; if (c == '\r' && i + 1 < s.length && s[i + 1] == '\n') { i++; istart++; } } } if (istart != s.length) { result.put(s[istart .. $]); } return result.data; } I guess it takes less than 30 seconds to fully understand this method. Then I rap.. I mean refactored it to this: S[] mysplitlines(S)(S s) { const CR = '\r'; const LF = '\n'; size_t istart; auto result = Appender!(S[])(); foreach (i; 0 .. s.length) { immutable c = s[i]; immutable isCR = (c == CR); immutable isLF = (c == LF); if (isCR || isLF) { auto line = s[istart .. i]; result.put(line); istart = i + 1; // Might be CRLF. In that case we need to consume LF too if (isCR) { immutable hasMoreCharacters = (i + 1 < s.length); immutable nextIsLF = hasMoreCharacters && (s[i + 1] == LF); immutable isCRLF = isCR && nextIsLF; if (isCRLF) { i++; istart++; } } } } immutable lineNotEmpty = (istart != s.length); if (lineNotEmpty) { auto lastLine = s[istart .. $]; result.put(lastLine); } return result.data; } Yes, I'm reading some books now and don't have much experience :) It went from a small 26 line, very readable function to 48 lines (85% increase!) with many more temporary variables.. So... Do you think this kind of code is (more) readable, or just way too verbose and doing more harm than good?The CR and LF constants are a bit too much, probably because they don't really abstract over the literals which I can actually parse faster. The isCR and isLF are nice however. Taking it a step further: bool canSplit = inPattern(c,"\r\n"); if (canSplit) { ... You have increased the nesting of ifs to 3 inside a for-loop. Personally I don't read deep nesting very well. To go for readability I would use a small function for the entire expression: if( s[i..$].startsWithCRLF() ) // same as startsWithCRLF(s[i..$]) { i++; istart++; } or use std.algorithm: if ( s[i..$].startsWith("\r\n") )And will the compiler generate slower code, or should the optimizer be able to inline the temporaries?I don't think it matters much, but you can only tell by testing. There is a benchmark function (called benchmark iirc) somewhere in phobos to start with and dmd has a builtin profiler too.
Aug 09 2010
Lutger wrote:simendsjo wrote:(...)The CR and LF constants are a bit too much, probably because they don't really abstract over the literals which I can actually parse faster. The isCR and isLF are nice however. Taking it a step further: bool canSplit = inPattern(c,"\r\n"); if (canSplit) { ... You have increased the nesting of ifs to 3 inside a for-loop.Personally I don't read deep nesting very well. To go for readability I would use a small function for the entire expression: if( s[i..$].startsWithCRLF() ) // same as startsWithCRLF(s[i..$]) { i++; istart++; } or use std.algorithm: if ( s[i..$].startsWith("\r\n") )(...) Nice. I didn't increase the if nesting though. Something like this then? S[] mysplitlines(S)(S s) { size_t istart; auto result = Appender!(S[])(); foreach (i; 0 .. s.length) { immutable c = s[i]; immutable isEOL = inPattern(c, "\r\n"); if (isEOL) { auto beforeEOL = s[istart .. i]; result.put(beforeEOL); auto rest = s[i .. $]; immutable isCRLF = rest.startsWith("\r\n"); istart = i + 1; // consume first EOL character if (isCRLF) // skip \n too { i++; istart++; } } } // The last line might not end with EOL immutable lineNotEmpty = (istart != s.length); if (lineNotEmpty) { auto lastLine = s[istart .. $]; result.put(lastLine); } return result.data; }
Aug 10 2010
simendsjo wrote:Lutger wrote:...I didn't increase the if nesting though.I count 2 nested if-statements inside of the foreach loop in the original, you have 3 nested if-statements.Something like this then?Looks good to me, yes.
Aug 10 2010