www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - avoid toLower in std.algorithm.sort compare alias

reply "Jay Norwood" <jayn prismnet.com> writes:
While playing with sorting the unzip archive entries I tried use 
of the last example in 
http://dlang.org/phobos/std_algorithm.html#sort

std.algorithm.sort!("toLower(a.name) < 
toLower(b.name)",std.algorithm.SwapStrategy.stable)(entries);

It was terribly slow  for sorting the 34k entries in my test 
case.  I'd guess it is doing the toLower call on both strings for 
every compare.

It was much, much faster to add an entries.lowerCaseName = 
std.string.toLower(entries.name) foreach entry prior to the sort 
execution and then use

std.algorithm.sort!("a.lowerCaseName < b.lowerCaseName 
",std.algorithm.SwapStrategy.stable)(entries);

The difference was on the order of 10 secs vs no noticeable delay 
when executing the sort operation in the debugger.
Apr 21 2012
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, April 22, 2012 01:24:56 Jay Norwood wrote:
 While playing with sorting the unzip archive entries I tried use
 of the last example in
 http://dlang.org/phobos/std_algorithm.html#sort
 
 std.algorithm.sort!("toLower(a.name) <
 toLower(b.name)",std.algorithm.SwapStrategy.stable)(entries);
 
 It was terribly slow  for sorting the 34k entries in my test
 case.  I'd guess it is doing the toLower call on both strings for
 every compare.
 
 It was much, much faster to add an entries.lowerCaseName =
 std.string.toLower(entries.name) foreach entry prior to the sort
 execution and then use
 
 std.algorithm.sort!("a.lowerCaseName < b.lowerCaseName
 ",std.algorithm.SwapStrategy.stable)(entries);
 
 The difference was on the order of 10 secs vs no noticeable delay
 when executing the sort operation in the debugger.
Yeah. toLower would be called on both strings on _every_ compare. And since that involves a loop, that would make the overall call to sort an order of magnitude worse than if you didn't call toLower at all. I'm not sure if it's an order of magnitude worse than your solution though, since it's been a while since I studied Big(O) notation (doing the conversion only once is still more expensive than not converting, but I'm not sure how much more expensive - it might cost less than sort such that it actually doesn't matter as for as Big(O) goes though). - Jonathan M Davis
Apr 21 2012
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Jonathan M Davis:

 I'm not sure if it's 
 an order of magnitude worse than your solution though, since it's been a while 
 since I studied Big(O) notation (doing the conversion only once is still more 
 expensive than not converting, but I'm not sure how much more expensive - it 
 might cost less than sort such that it actually doesn't matter as for as 
 Big(O) goes though).
Performing the toLower every time the cmp function is called doesn't change the O complexity. In Phobos there is an alternative sorting (Schwartzian sorting routime) that applies a function to each item before sorting them, usually is much slower than the normal sorting, but maybe this time it's convenient. The performance improvement in the OP message is large, maybe it's a problem of memory allocations of the converted strings... More info on the original code is needed to give better answers. Bye, bearophile
Apr 21 2012
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, April 21, 2012 20:36:18 bearophile wrote:
 Jonathan M Davis:
 I'm not sure if it's
 an order of magnitude worse than your solution though, since it's been a
 while since I studied Big(O) notation (doing the conversion only once is
 still more expensive than not converting, but I'm not sure how much more
 expensive - it might cost less than sort such that it actually doesn't
 matter as for as Big(O) goes though).
Performing the toLower every time the cmp function is called doesn't change the O complexity.
Yes it does. It adds a loop to each comparison (two loops actually, but since they're not nested, Big(O) only cares about the one), since toLower has to loop over all of the characters. As sort loops over each of the strings to compare them for moving them into a sorted position or not, it calls toLower, which adds a nested loop, so it increases the Big(O) complexity. Something like this foreach(str; strings) str < otherStr; becomes foreach(str; strings) { foreach(dchar c; str) //lower characters foreach(dchar c; otherStr) //lower characters str < otherStr; } though that's obviously very pseudo-code-ish and not exactly what sort does. Regardless, those extra loops when the comparison happens are nested and therefore increase the Big(O) complexity of the overall algorithm. - Jonathan M Davis
Apr 21 2012
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Apr 21, 2012 at 05:45:35PM -0700, Jonathan M Davis wrote:
 On Saturday, April 21, 2012 20:36:18 bearophile wrote:
 Jonathan M Davis:
 I'm not sure if it's an order of magnitude worse than your
 solution though, since it's been a while since I studied Big(O)
 notation (doing the conversion only once is still more expensive
 than not converting, but I'm not sure how much more expensive - it
 might cost less than sort such that it actually doesn't matter as
 for as Big(O) goes though).
Performing the toLower every time the cmp function is called doesn't change the O complexity.
Yes it does. It adds a loop to each comparison (two loops actually, but since they're not nested, Big(O) only cares about the one), since toLower has to loop over all of the characters. As sort loops over each of the strings to compare them for moving them into a sorted position or not, it calls toLower, which adds a nested loop, so it increases the Big(O) complexity. Something like this foreach(str; strings) str < otherStr; becomes foreach(str; strings) { foreach(dchar c; str) //lower characters foreach(dchar c; otherStr) //lower characters str < otherStr; } though that's obviously very pseudo-code-ish and not exactly what sort does. Regardless, those extra loops when the comparison happens are nested and therefore increase the Big(O) complexity of the overall algorithm.
[...] Actually, I don't think the nested loops affect Big-O complexity at all. The O(l) complexity (where l = string length) is already inherent in the string comparison "str < otherStr". Adding more loops over the strings doesn't change the Big-O complexity (you just make O(l) into 3*O(l) which is the same as O(l)). However, always remember that Big-O notation hides constant factors and terms. These factors are negligible given a sufficiently large problem size, but for small problem sizes, the hidden constants are very much significant. An O(n log n) algorithm may actually take n*log(10*n) steps or 1000*n*log(n) steps; for large enough n, they're approximately the same, but when n is small, the 1000 has a very significant hit on observed performance. That's why optimizing the constant factors is sometimes necessary (provided you've already minimized the big-O complexity to the full, since otherwise you're just pinching pennies yet freely spending $100 bills). Inner loop optimization, like strength reduction, loop invariant hoisting, etc., are examples of where constant factors get optimized. If you have a loop: real x; for (i=0; i < n; i++) { // bigComplexCalculation is independent of i real n = bigComplexCalculation(x); // make use of n in some way } Then moving the bigComplexCalculation() call outside the loop improves the observed performance significantly, even though you're not changing the big-O complexity: a small change from 10*n to 8*n means a 20% improvement in observed performance, even though the algorithm still degrades linearly with problem size just like before. T -- Microsoft is to operating systems & security ... what McDonalds is to gourmet cooking.
Apr 21 2012
prev sibling parent reply "Jay Norwood" <jayn prismnet.com> writes:
On Saturday, 21 April 2012 at 23:54:26 UTC, Jonathan M Davis 
wrote:
 Yeah. toLower would be called on both strings on _every_ 
 compare. And since
 that involves a loop, that would make the overall call to sort 
 an order of
 magnitude worse than if you didn't call toLower at all. I'm not 
 sure if it's
 an order of magnitude worse than your solution though, since 
 it's been a while
 since I studied Big(O) notation (doing the conversion only once 
 is still more
 expensive than not converting, but I'm not sure how much more 
 expensive - it
 might cost less than sort such that it actually doesn't matter 
 as for as
 Big(O) goes though).

 - Jonathan M Davis
use of toLower in the sort expression adds around 11.2 secs overhead to a 0.3 sec operation which reads and sorts the 34k directory entries in this 2GB layout on the ssd drive, so it isn't an option for me. finished! time:312 ms finished! time:11598 ms std.algorithm.sort!("toLower(a) < toLower(b)",std.algorithm.SwapStrategy.stable)(dirs); //std.algorithm.sort!("a < b", std.algorithm.SwapStrategy.stable)(dirs);
Apr 21 2012
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, April 22, 2012 03:47:30 Jay Norwood wrote:
 On Saturday, 21 April 2012 at 23:54:26 UTC, Jonathan M Davis
 
 wrote:
 Yeah. toLower would be called on both strings on _every_
 compare. And since
 that involves a loop, that would make the overall call to sort
 an order of
 magnitude worse than if you didn't call toLower at all. I'm not
 sure if it's
 an order of magnitude worse than your solution though, since
 it's been a while
 since I studied Big(O) notation (doing the conversion only once
 is still more
 expensive than not converting, but I'm not sure how much more
 expensive - it
 might cost less than sort such that it actually doesn't matter
 as for as
 Big(O) goes though).
 
 - Jonathan M Davis
use of toLower in the sort expression adds around 11.2 secs overhead to a 0.3 sec operation which reads and sorts the 34k directory entries in this 2GB layout on the ssd drive, so it isn't an option for me. finished! time:312 ms finished! time:11598 ms std.algorithm.sort!("toLower(a) < toLower(b)",std.algorithm.SwapStrategy.stable)(dirs); //std.algorithm.sort!("a < b", std.algorithm.SwapStrategy.stable)(dirs);
It wasn't saying that it _was_ an option. I was just trying to point out the likely reason why it's so bad with toLower - algorithmically-speaking. This definitely appears to be a case where doing some extra computation ahead of time will save you a lot later. - Jonathan M Davis
Apr 21 2012
prev sibling next sibling parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 04/21/2012 04:24 PM, Jay Norwood wrote:
 While playing with sorting the unzip archive entries I tried use of the
 last example in http://dlang.org/phobos/std_algorithm.html#sort

 std.algorithm.sort!("toLower(a.name) <
 toLower(b.name)",std.algorithm.SwapStrategy.stable)(entries);
Stealing this thread to point out that converting a letter to upper or lower case cannot be done without knowing the writing system. Phobos's toLower() documentation currently says: "Returns a string which is identical to s except that all of its characters are lowercase (in unicode, not just ASCII)." Unicode cannot define the conversions of at least the following letters without knowing the actual alphabet that the text is written in: - Lowercase of I is ı in some alphabets[*] and i in many others. - Uppercase of i is İ in some alphabets[*] and I in many others. Ali [*] Turkish, Azeri, Chrimean Tatar, Gagauz, Celtic, etc.
Apr 21 2012
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, April 21, 2012 18:43:23 Ali =C3=87ehreli wrote:
 On 04/21/2012 04:24 PM, Jay Norwood wrote:
  > While playing with sorting the unzip archive entries I tried use o=
f the
  > last example in http://dlang.org/phobos/std_algorithm.html#sort
  >=20
  > std.algorithm.sort!("toLower(a.name) <
  > toLower(b.name)",std.algorithm.SwapStrategy.stable)(entries);
=20
 Stealing this thread to point out that converting a letter to upper o=
r
 lower case cannot be done without knowing the writing system. Phobos'=
s
 toLower() documentation currently says: "Returns a string which is
 identical to s except that all of its characters are lowercase (in
 unicode, not just ASCII)."
=20
 Unicode cannot define the conversions of at least the following lette=
rs
 without knowing the actual alphabet that the text is written in:
=20
 - Lowercase of I is =C4=B1 in some alphabets[*] and i in many others.=
=20
 - Uppercase of i is =C4=B0 in some alphabets[*] and I in many others.=
=20
 Ali
=20
 [*] Turkish, Azeri, Chrimean Tatar, Gagauz, Celtic, etc.
toLower and toUpper get pretty screwing with unicode. I don't know enou= gh=20 about non-English alphabets to know what affects what, but at minimum, = there=20 are a number of cases where toLower does not reverse toUpper (and vice = versa).=20 Rather, it converts the character into yet another letter. So, toLower = to=20 toUpper with unicode and definitely a bit iffy. I suppose that they do = the job=20 if you call them enough on the string that it doesn't change anymore, b= ut I=20 don't know. I also don't know how they act with regards to the various alphabets an= d how=20 their implementation was decided upon. IIRC, Walter wrote them, and I'm= sure=20 that they're based on the unicode standard, but what that amounts to, I= don't=20 know. - Jonathan M Davis
Apr 21 2012
prev sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 22.04.2012 5:43, Ali Çehreli wrote:
 On 04/21/2012 04:24 PM, Jay Norwood wrote:
  > While playing with sorting the unzip archive entries I tried use of the
  > last example in http://dlang.org/phobos/std_algorithm.html#sort
  >
  > std.algorithm.sort!("toLower(a.name) <
  > toLower(b.name)",std.algorithm.SwapStrategy.stable)(entries);

 Stealing this thread to point out that converting a letter to upper or
 lower case cannot be done without knowing the writing system. Phobos's
 toLower() documentation currently says: "Returns a string which is
 identical to s except that all of its characters are lowercase (in
 unicode, not just ASCII)."
Oh, come on. This function wasn't updated for ages. I bet this wording here is intact since unicode 4.0 ;)
 Unicode cannot define the conversions of at least the following letters
 without knowing the actual alphabet that the text is written in:

 - Lowercase of I is ı in some alphabets[*] and i in many others.

 - Uppercase of i is İ in some alphabets[*] and I in many others.
Fair point. The list however is not that long and a system may choose to support this or not (changing behavior based on writing system is called tailoring I believe).
 Ali

 [*] Turkish, Azeri, Chrimean Tatar, Gagauz, Celtic, etc.
-- Dmitry Olshansky
Apr 22 2012
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, April 21, 2012 18:26:42 H. S. Teoh wrote:
 Actually, I don't think the nested loops affect Big-O complexity at all.
 The O(l) complexity (where l = string length) is already inherent in the
 string comparison "str < otherStr". Adding more loops over the strings
 doesn't change the Big-O complexity (you just make O(l) into 3*O(l)
 which is the same as O(l)).
If the loop is really nested, then it will increase the Big(O) complexity, whereas if it's parallel to a similar loop, it will just increase the constant factor. Which it really is in this case, I don't know without sitting down and sketching out the exact algorithm, but certainly upon a first glance, it was my conclusion that the loops were nested. If they're not, then they're not. Regardless of whether it's the Big(O) complexity or the constant factor that's the problem here, clearly there's enough additional overhead that it's causing problems for Jay's particular case. It's also the sort of thing that can be easy to miss and then end up wondering why your code is so slow (if it actually matters in your particular situation). - Jonathan M Davis
Apr 21 2012
parent reply "Jay Norwood" <jayn prismnet.com> writes:
On Sunday, 22 April 2012 at 02:29:45 UTC, Jonathan M Davis wrote:

 Regardless of whether it's the Big(O) complexity or the 
 constant factor that's
 the problem here, clearly there's enough additional overhead 
 that it's causing
 problems for Jay's particular case. It's also the sort of thing 
 that can be
 easy to miss and then end up wondering why your code is so slow 
 (if it
 actually matters in your particular situation).

 - Jonathan M Davis
I haven't looked at strncmpi code, but I suspect it is a lot more efficient. For example, in comparing AbbbCdddEfffXabcdEfgh AbbbCdddEfffYabcdEfgh it is not necessary to do case conversion on anything except X and Y, and if isUpper(X)==isUpper(Y) then X and Y can be compared without conversion, and since X and Y are not equal the remaining characters don't have to be converted. The comment below worries me a little bit about std.string.icmp. What if they are two 1MB strings that differ in he first character? Does it really convert both strings to lower case before comparing the first character? http://dlang.org/phobos/std_string.html#icmp "Technically, icmp(r1, r2) is equivalent to cmp!"std.uni.toLower(a) < std.uni.toLower(b)"(r1, r2). "
Apr 21 2012
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, April 22, 2012 08:20:13 Jay Norwood wrote:
 The comment below worries me a little bit about std.string.icmp.
 What if they are two 1MB strings that differ in he first
 character?  Does it really convert both strings to lower case
 before comparing the first character?
 
 http://dlang.org/phobos/std_string.html#icmp
 
 "Technically, icmp(r1, r2) is equivalent to
 cmp!"std.uni.toLower(a) < std.uni.toLower(b)"(r1, r2). "
You can look at the code. It checks each of the characters in place. Unlike toLower, it doesn't need to generate a new string. But as far as the comparison goes, they're the same - hence that line in the docs. - Jonathan M Davis
Apr 21 2012
parent reply "Jay Norwood" <jayn prismnet.com> writes:
On Sunday, 22 April 2012 at 06:26:42 UTC, Jonathan M Davis wrote:
 You can look at the code. It checks each of the characters in 
 place. Unlike
 toLower, it doesn't need to generate a new string. But as far 
 as the
 comparison goes, they're the same - hence that line in the docs.

 - Jonathan M Davis
ok, I did look at the code just now, and I'll sleep better knowing that it doesn't do the whole string conversion. I misunderstood your pseudo-code to mean that two lower case strings were being created prior to the compare. However, icmp code does appear to call the toLower conversion on both characters without first comparing the characters for equality, which misses the chance to do a simple compare that would avoid the two calls.
Apr 22 2012
parent Marco Leise <Marco.Leise gmx.de> writes:
Am Sun, 22 Apr 2012 09:23:45 +0200
schrieb "Jay Norwood" <jayn prismnet.com>:

 On Sunday, 22 April 2012 at 06:26:42 UTC, Jonathan M Davis wrote:
 You can look at the code. It checks each of the characters in 
 place. Unlike
 toLower, it doesn't need to generate a new string. But as far 
 as the
 comparison goes, they're the same - hence that line in the docs.

 - Jonathan M Davis
ok, I did look at the code just now, and I'll sleep better knowing that it doesn't do the whole string conversion. I misunderstood your pseudo-code to mean that two lower case strings were being created prior to the compare. However, icmp code does appear to call the toLower conversion on both characters without first comparing the characters for equality, which misses the chance to do a simple compare that would avoid the two calls.
/----- check for equality :) v cmp!"a != b && std.uni.toLower(a) < std.uni.toLower(b)"(r1, r2) -- Marco
Apr 25 2012
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2012-04-22 01:24, Jay Norwood wrote:
 While playing with sorting the unzip archive entries I tried use of the
 last example in http://dlang.org/phobos/std_algorithm.html#sort

 std.algorithm.sort!("toLower(a.name) <
 toLower(b.name)",std.algorithm.SwapStrategy.stable)(entries);

 It was terribly slow for sorting the 34k entries in my test case. I'd
 guess it is doing the toLower call on both strings for every compare.

 It was much, much faster to add an entries.lowerCaseName =
 std.string.toLower(entries.name) foreach entry prior to the sort
 execution and then use

 std.algorithm.sort!("a.lowerCaseName < b.lowerCaseName
 ",std.algorithm.SwapStrategy.stable)(entries);

 The difference was on the order of 10 secs vs no noticeable delay when
 executing the sort operation in the debugger.
Perhaps a function that does case folding would be better in this case. But as far as I know Phobos doesn't have a function for that. -- /Jacob Carlborg
Apr 22 2012
prev sibling next sibling parent reply "SomeDude" <lovelydear mailmetrash.com> writes:
On Saturday, 21 April 2012 at 23:24:57 UTC, Jay Norwood wrote:
 While playing with sorting the unzip archive entries I tried 
 use of the last example in 
 http://dlang.org/phobos/std_algorithm.html#sort

 std.algorithm.sort!("toLower(a.name) < 
 toLower(b.name)",std.algorithm.SwapStrategy.stable)(entries);

 It was terribly slow  for sorting the 34k entries in my test 
 case.  I'd guess it is doing the toLower call on both strings 
 for every compare.

 It was much, much faster to add an entries.lowerCaseName = 
 std.string.toLower(entries.name) foreach entry prior to the 
 sort execution and then use

 std.algorithm.sort!("a.lowerCaseName < b.lowerCaseName 
 ",std.algorithm.SwapStrategy.stable)(entries);

 The difference was on the order of 10 secs vs no noticeable 
 delay when executing the sort operation in the debugger.
Good point. Perhaps this should be added in the documentation of std.algorithm ? It's easy to get trapped.
Apr 22 2012
parent "Jay Norwood" <jayn prismnet.com> writes:
On Sunday, 22 April 2012 at 00:36:19 UTC, bearophile wrote:
 Performing the toLower every time the cmp function is called 
 doesn't change the O complexity. In Phobos there is an 
 alternative sorting (Schwartzian sorting routime) that applies 
 a function to each item before sorting them, usually is much 
 slower than the normal sorting, but maybe this time it's 
 convenient.
 Bye,
 bearophile
The shwartzSort works fine. Thanks, Jay std.algorithm.schwartzSort!(std.string.toLower, "a < b")(dirs); G:\d\rmdir2\rmdir2\rmdir2\Debug>rmdir2 removing: g:\tz finished! time:699 ms
Apr 22 2012
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 21 Apr 2012 19:24:56 -0400, Jay Norwood <jayn prismnet.com> wrote:

 While playing with sorting the unzip archive entries I tried use of the  
 last example in http://dlang.org/phobos/std_algorithm.html#sort

 std.algorithm.sort!("toLower(a.name) <  
 toLower(b.name)",std.algorithm.SwapStrategy.stable)(entries);

 It was terribly slow  for sorting the 34k entries in my test case.  I'd  
 guess it is doing the toLower call on both strings for every compare.

 It was much, much faster to add an entries.lowerCaseName =  
 std.string.toLower(entries.name) foreach entry prior to the sort  
 execution and then use

 std.algorithm.sort!("a.lowerCaseName < b.lowerCaseName  
 ",std.algorithm.SwapStrategy.stable)(entries);

 The difference was on the order of 10 secs vs no noticeable delay when  
 executing the sort operation in the debugger.
I'll point out what I haven't seen yet: the issue is not so much toLower being called on every comparison, but more that toLower allocates (and then throws away!). I think using std.string.icmp is the best solution. I would expect it to outperform even schwartz sort. Note, to answer your question elsewhere, the comment is accurate, std.uni.toLower(a) is a function that accepts a dchar, not a string. What the comment is saying is that for the "ranges" (i.e. strings) given, it runs the given comparison on the std.uni.toLower() result for each element (i.e. dchar). -Steve
Apr 23 2012
parent reply "Jay Norwood" <jayn prismnet.com> writes:
On Monday, 23 April 2012 at 11:27:40 UTC, Steven Schveighoffer
wrote:

 I think using std.string.icmp is the best solution.  I would 
 expect it to outperform even schwartz sort.

 -Steve
icmp took longer... added about 1 sec vs 0.3 sec (for schwartzSort ) to the program execution time. bool myComp(string x, string y) { return std.string.icmp(x,y)<0; } std.algorithm.sort!(myComp)(dirs); finished! time:1396 ms
Apr 23 2012
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 23 Apr 2012 09:49:50 -0400, Jay Norwood <jayn prismnet.com> wrote:

 On Monday, 23 April 2012 at 11:27:40 UTC, Steven Schveighoffer
 wrote:

 I think using std.string.icmp is the best solution.  I would expect it  
 to outperform even schwartz sort.

 -Steve
icmp took longer... added about 1 sec vs 0.3 sec (for schwartzSort ) to the program execution time. bool myComp(string x, string y) { return std.string.icmp(x,y)<0; } std.algorithm.sort!(myComp)(dirs); finished! time:1396 ms
Well, that's surprising :) Perhaps there's some room for improvement in icmp or std.uni.toLower. There may be some constructs that are preventing inlining (enforce is the worst offender). While dealing with unicode in my std.stream rewrite, I've found that hand-decoding dchars is way faster than using library calls. -Steve
Apr 23 2012
parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Mon, 23 Apr 2012 16:43:20 +0100, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 While dealing with unicode in my std.stream rewrite, I've found that  
 hand-decoding dchars is way faster than using library calls.
After watching Andrei's talk on generic and generative programming I have to ask, which routines are you avoiding .. it seems we need to make them as good as the hand coded code you've written... R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Apr 24 2012
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tuesday, 24 April 2012 at 11:24:44 UTC, Regan Heath wrote:
 On Mon, 23 Apr 2012 16:43:20 +0100, Steven Schveighoffer 
 <schveiguy yahoo.com> wrote:

 While dealing with unicode in my std.stream rewrite, I've 
 found that hand-decoding dchars is way faster than using 
 library calls.
After watching Andrei's talk on generic and generative programming I have to ask, which routines are you avoiding .. it seems we need to make them as good as the hand coded code you've written...
from memory (don't have the code in front of me right now), it was std.uni.decode, and using foreach(dchar d; str) (which cannot be inlined currently). IIRC, std.uni.decode was not being inlined. So I tried hand-inlining it (I also discovered some optimizations it was not using), and it made a huge difference. In regards to this discussion, I think icmp can also be improved when run on a char array, by doing a byte comparison (no dchar decoding) until it finds a difference. That might be a huge speedup. Right now, all dchars are being decoded, and translated to the toLower counterpart. It may have an opposite effect, however, if there are a lot of strings that are equivalent when ignoring case, but not exactly the same. -Steve
Apr 24 2012
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tuesday, 24 April 2012 at 14:54:48 UTC, Steven Schveighoffer 
wrote:
 On Tuesday, 24 April 2012 at 11:24:44 UTC, Regan Heath wrote:
 After watching Andrei's talk on generic and generative 
 programming I have to ask, which routines are you avoiding .. 
 it seems we need to make them as good as the hand coded code 
 you've written...
from memory (don't have the code in front of me right now), it was std.uni.decode, and using foreach(dchar d; str) (which cannot be inlined currently). IIRC, std.uni.decode was not being inlined. So I tried hand-inlining it (I also discovered some optimizations it was not using), and it made a huge difference.
BTW, you can check out my github branch of phobos named new-io2, look at the textstream struct to see what I've inlined. -Steve
Apr 24 2012
prev sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, April 24, 2012 12:24:44 Regan Heath wrote:
 On Mon, 23 Apr 2012 16:43:20 +0100, Steven Schveighoffer
 
 <schveiguy yahoo.com> wrote:
 While dealing with unicode in my std.stream rewrite, I've found that
 hand-decoding dchars is way faster than using library calls.
After watching Andrei's talk on generic and generative programming I have to ask, which routines are you avoiding .. it seems we need to make them as good as the hand coded code you've written...
In general, when operating on strings generically, you up having to treat them as ranges of dchar and decode everything, but there are a lot of cases where you can special-case algorithms for narrow strings and avoid decoding them. Phobos does this a lot (though it can probably do a better job of it in a number of places), so by using functions from there rather than rolling your own, the problem is reduced, but any time that you're doing a lot of generic string processing, there's a decent chance that you're going to have to special case some stuff for arrays of char, wchar, and dchar in order to fully optimize it. And I don't think that there's really a way out of that beyond having a lot of functions already available (and already optimized) to do a lot of the string processing for you. There's a definite tension between genericity and effciency in the case of string processing - due primarily to variable length encodings. - Jonathan M Davis
Apr 24 2012