digitalmars.D - Compiler benchmarks for an alternative to std.uni.asLowerCase.

Jon D (71/71) May 08 2016 I did a performance study on speeding up case conversion in

Peter =?UTF-8?B?SMOkZ2dtYW4=?= (3/9) May 08 2016 Nice, it seems that you would have enough material to advocate a

Jon D (7/17) May 08 2016 Thanks! I haven't yet taken the time to go through the 'becoming

Lionello Lunesu (4/20) May 09 2016 Less code, and faster, and using existing building blocks. Win, win,

Dmitry Olshansky (7/18) May 09 2016 The only problem is that it should consider multi-codepoint replacements...

Marc =?UTF-8?B?U2Now7x0eg==?= (3/9) May 11 2016 What about locale dependent case mappings (e.g. Turkish Iı İi)?

Marco Leise (24/34) May 12 2016 In case someone wonders, an example for multi-codepoint

Dmitry Olshansky (9/27) May 12 2016 Proper handling of that is called tailoring in Unicode. Personally I

qznc (9/15) May 09 2016 This makes me suspicious, since you did not measure standard

Jon D <jond noreply.com> writes:

I did a performance study on speeding up case conversion in 
std.uni.asLowerCase. Specifics for asLowerCase have been added to 
issue https://issues.dlang.org/show_bug.cgi?id=11229. Publishing 
here as some of the more general observations may be of wider 
interest.

Background - Case conversion can generally be sped up by checking 
if a character is ascii before invoking a full unicode case 
conversion. The single character std.uni.toLower does this 
optimization, but std.uni.asLowerCase does not. asLowerCase does 
a lazy conversion of a range. For the test, I created a 
replacement for asLowerCase which uses map and toLower. In 
essence, `map!(x => x.toLower)` or `map!(x => x.byDchar.toLower)`.

Testing was with DMD (2.071) and LDC 1.0.0-beta1 (Phobos 2.070) 
on OSX. Compiler settings were `-release -O -boundscheck=off`. 
DMD was tested with and without `-inline`. LDC turns on inlining 
(-enable-inlining=1) by default with -O, but DMD does not. Texts 
tried were in Japanese, Chinese, Finnish, English, German, and 
Spanish. Timing was done both including and excluding decoding 
from utf-8 to dchar.

Performance delta including decoding to dchar:
   | Language group  | Pct Ascii | LDC gain   | DMD gain  | DMD no 
inline  |
   
|-----------------+-----------+------------+-----------+----------------|
   | Latin           |    95-99% | 64% (2.7x) | 93% (14x) | 48% 
(1.9x)     |
   | Asian (Jpn/Chn) |  2.4-3.7% | 36% (1.6x) | 80% (5x)  | -1%

Performance delta excluding decoding to dchar:
   | Language group  | Pct Ascii | LDC gain   | DMD gain  | DMD no 
inline |
   
|-----------------+-----------+------------+-----------+---------------|
   | Latin           |    95-99% | 60% (2.5x) | 95% (20x) | 60% 
(2.5x)    |
   | Asian (Jpn/Chn) |  2.4-3.7% | 50% (2x)   | 95% (20x) | -2%

Observations:
* mapAsLowerCase was faster than asLowerCase across the board. 
That it was better for Asian texts suggests the improvement 
involved more just the ascii check optimization.
* Performance varied widely between compilers, and for DMD, 
whether the -inline flag was included. The performance delta 
between asLowerCase and the mapAsLowerCase replacement was very 
dependent on these choices. Similarly, the delta between 
inclusion and exclusion of auto-decoding was highly dependent on 
these selections.
* DMD improvement by using -inline: 30% for asLowerCase (1.5x), 
90% for mapAsLowerCase (10x).
* DMD (-inline) vs LDC: For asLowerCase, LDC was 65-85% faster. 
For mapAsLowerCase, DMD was 10-40% faster. There were changes to 
the map implementation in 2.071, so these were not equivalent, 
but still, it's interesting that DMD beat LDC in this case.

Thoughts:
* The large variances between compiler settings imply extra 
diligence when performance tuning at the source code level, 
especially for code intended for multiple compilers.
* Perhaps DMD -O should also turn on -inline. This would present 
a better performance picture to new users. It's also helpful when 
the different compilers agree on rough meaning of compiler 
switches.
* Auto-decoding is an oft discussed concern. It doesn't show up 
in the table above, but the data I looked at suggests the 
cost/penalty may vary quite a bit depending on usage context and 
compiler/settings. I wasn't studying aspect explicitly. It may be 
worth its own analysis.

Other details:
* Code for mapAsLowerCase and the timing program is at: 
https://dpaste.dzfl.pl/a0e2fa1c71fd
* Texts used for timing were books in several languages from the 
Project Gutenberg site (http://www.gutenberg.org/), with 
boilerplate text removed.

--Jon

May 08 2016

Peter =?UTF-8?B?SMOkZ2dtYW4=?= <hagg.pet2 online.fi> writes:

On Sunday, 8 May 2016 at 23:38:31 UTC, Jon D wrote:
 I did a performance study on speeding up case conversion in 
 std.uni.asLowerCase. Specifics for asLowerCase have been added 
 to issue https://issues.dlang.org/show_bug.cgi?id=11229. 
 Publishing here as some of the more general observations may be 
 of wider interest.

 [...]

Nice, it seems that you would have enough material to advocate a 
pull request in phobos then ;)

May 08 2016

Jon D <jond noreply.com> writes:

On Monday, 9 May 2016 at 00:15:03 UTC, Peter Häggman wrote:
 On Sunday, 8 May 2016 at 23:38:31 UTC, Jon D wrote:
 I did a performance study on speeding up case conversion in 
 std.uni.asLowerCase. Specifics for asLowerCase have been added 
 to issue https://issues.dlang.org/show_bug.cgi?id=11229. 
 Publishing here as some of the more general observations may 
 be of wider interest.

 [...]

 Nice, it seems that you would have enough material to advocate 
 a pull request in phobos then ;)

Thanks! I haven't yet taken the time to go through the 'becoming 
a contributor' steps, when I have the time I'll do that. In this 
case, I'd want to start by validating with the library designers 
that the approach makes sense. It by-passes what appears to a 
basic primitive, std.uni.toCaser. There may be reasons this is 
not desirable.

May 08 2016

Lionello Lunesu <lionello lunesu.remove.com> writes:

On 9/5/2016 12:01, Jon D wrote:
 On Monday, 9 May 2016 at 00:15:03 UTC, Peter Häggman wrote:
 On Sunday, 8 May 2016 at 23:38:31 UTC, Jon D wrote:
 I did a performance study on speeding up case conversion in
 std.uni.asLowerCase. Specifics for asLowerCase have been added to
 issue https://issues.dlang.org/show_bug.cgi?id=11229. Publishing here
 as some of the more general observations may be of wider interest.

 [...]

 Nice, it seems that you would have enough material to advocate a pull
 request in phobos then ;)

 Thanks! I haven't yet taken the time to go through the 'becoming a
 contributor' steps, when I have the time I'll do that. In this case, I'd
 want to start by validating with the library designers that the approach
 makes sense. It by-passes what appears to a basic primitive,
 std.uni.toCaser. There may be reasons this is not desirable.

Less code, and faster, and using existing building blocks. Win, win, 
win, if you ask me!

+1

May 09 2016

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 09-May-2016 02:38, Jon D wrote:
 I did a performance study on speeding up case conversion in
 std.uni.asLowerCase. Specifics for asLowerCase have been added to issue
 https://issues.dlang.org/show_bug.cgi?id=11229. Publishing here as some
 of the more general observations may be of wider interest.

 Background - Case conversion can generally be sped up by checking if a
 character is ascii before invoking a full unicode case conversion. The
 single character std.uni.toLower does this optimization, but
 std.uni.asLowerCase does not. asLowerCase does a lazy conversion of a
 range. For the test, I created a replacement for asLowerCase which uses
 map and toLower. In essence, `map!(x => x.toLower)` or `map!(x =>
 x.byDchar.toLower)`.

The only problem is that it should consider multi-codepoint replacements 
aka full-case folding in Unicode.
Otherwise - go ahead and issue a pull request to add
special case for < 0x80.


-- 
Dmitry Olshansky

May 09 2016

Marc =?UTF-8?B?U2Now7x0eg==?= <schuetzm gmx.net> writes:

On Monday, 9 May 2016 at 08:44:53 UTC, Dmitry Olshansky wrote:
 On 09-May-2016 02:38, Jon D wrote:
 [...]

 The only problem is that it should consider multi-codepoint 
 replacements aka full-case folding in Unicode.
 Otherwise - go ahead and issue a pull request to add
 special case for < 0x80.

What about locale dependent case mappings (e.g. Turkish Iı İi)? 
Or is that currently not supported by std.uni?

May 11 2016

Marco Leise <Marco.Leise gmx.de> writes:

Am Wed, 11 May 2016 11:37:03 +0000
schrieb Marc Sch=C3=BCtz <schuetzm gmx.net>:

 On Monday, 9 May 2016 at 08:44:53 UTC, Dmitry Olshansky wrote:
 On 09-May-2016 02:38, Jon D wrote: =20
 [...] =20

 The only problem is that it should consider multi-codepoint=20
 replacements aka full-case folding in Unicode.
 Otherwise - go ahead and issue a pull request to add
 special case for < 0x80.


In case someone wonders, an example for multi-codepoint
replacements is when =C3=9F becomes SS in upper-case.

 What about locale dependent case mappings (e.g. Turkish I=C4=B1 =C4=B0i)?=

=20
 Or is that currently not supported by std.uni?

I second those thoughts. Important players* agree that you
cannot do case conversion or string sorting without knowing
the locale and sorting scheme, and Phobos that set out to be
"Unicode first" can't even express locales to begin with.

*
 C++:
 http://en.cppreference.com/w/cpp/locale/tolower
 POSIX:
 http://pubs.opengroup.org/onlinepubs/9699919799/functions/tolower.html
 Java:
 https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#toLowerCas=
e%28java.util.Locale%29
 Go:
 Recently added special handling of Turkish, but lack of
 full-case-folding is still a "BUG" comment in the code.

The actual discussion on std.locale is here:
http://forum.dlang.org/thread/gof4ft$2odv$1 digitalmars.com

--=20
Marco

May 12 2016

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 12-May-2016 17:23, Marco Leise wrote:
 Am Wed, 11 May 2016 11:37:03 +0000
 schrieb Marc Schütz <schuetzm gmx.net>:

 On Monday, 9 May 2016 at 08:44:53 UTC, Dmitry Olshansky wrote:
 On 09-May-2016 02:38, Jon D wrote:
 [...]

 The only problem is that it should consider multi-codepoint
 replacements aka full-case folding in Unicode.
 Otherwise - go ahead and issue a pull request to add
 special case for < 0x80.


 In case someone wonders, an example for multi-codepoint
 replacements is when ß becomes SS in upper-case.

 What about locale dependent case mappings (e.g. Turkish Iı İi)?
 Or is that currently not supported by std.uni?

 I second those thoughts. Important players* agree that you
 cannot do case conversion or string sorting without knowing
 the locale and sorting scheme, and Phobos that set out to be
 "Unicode first" can't even express locales to begin with.

Proper handling of that is called tailoring in Unicode. Personally I 
don't think it would be terribly hard to go and do:

Locale locale = fetchSomeTailoredLocale(...);
locale.toLower(...); // etc. all functions as members

By the way we do what is called default handling (tables) which is 
sensible "default" IMO.

-- 
Dmitry Olshansky

May 12 2016

qznc <qznc web.de> writes:

On Sunday, 8 May 2016 at 23:38:31 UTC, Jon D wrote:
 * Performance varied widely between compilers, and for DMD, 
 whether the -inline flag was included. The performance delta 
 between asLowerCase and the mapAsLowerCase replacement was very 
 dependent on these choices. Similarly, the delta between 
 inclusion and exclusion of auto-decoding was highly dependent 
 on these selections.

This makes me suspicious, since you did not measure standard 
deviation or variance. A quick local test with your code looks 
ok, though.

One of my students recently made a benchmarking tool.
If you really want to do this properly, you could try that:
https://github.com/parttimenerd/temci

I would not require better statistics here, since the speedups 
seem to be stable. :)

May 09 2016

D Programming

C/C++ Programming

Other

digitalmars.D - Compiler benchmarks for an alternative to std.uni.asLowerCase.