
digitalmars.D.bugs - [Issue 11229] New: std.string.toLower is slow

http://d.puremagic.com/issues/show_bug.cgi?id=11229

           Summary: std.string.toLower is slow
           Product: D
           Version: D2
          Platform: x86
        OS/Version: Windows
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: bearophile_hugs eml.cc


--- Comment #0 from bearophile_hugs eml.cc 2013-10-11 16:56:53 PDT ---
As test text I have used the Gutenberg "Pride and Prejudice":
http://www.gutenberg.org/cache/epub/1342/pg1342.txt


The simple D program:

void main() {
    import std.string: toLower;
    import std.file: read;
    (cast(string)"pg1342.txt".read).toLower;
}


A similar Python2.6 program:


def main():
    open("pg1342.txt").read().lower()
main()



The Python code runs (including starting the interpreter) in about 0.07 seconds
on my Core2 PC.

The D version compiled with dmd 2.064 with -O -release -inline -noboundscheck
runs in about 0.30 seconds.

The two programs are not exactly equivalent because Python2.6 strings are not
Unicode. But I think a fast path for ASCII or near-ASCII text inside
std.string.toLower could bring its performance to something similar to the
Python performance.
(A possible alternative solution is to introduce a function similar to toLower
function in the std.ascii module.)
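
To make the fast-path idea concrete, here is a minimal sketch of what such a
wrapper could look like. The name toLowerAsciiFastPath is hypothetical and not
part of Phobos, and this is not a claim about how std.string.toLower is or
should be implemented. It assumes the input is valid UTF-8, so any byte below
0x80 is a complete ASCII character. It handles only the pure-ASCII case
specially; "near-ASCII" text would still take the general path.

```d
import std.string : toLower;

/// Hypothetical helper (not in Phobos): lowercases pure-ASCII input with a
/// simple byte loop and defers to std.string.toLower for everything else.
string toLowerAsciiFastPath(string s)
{
    // Pass 1: any code unit >= 0x80 means the text is not pure ASCII,
    // so hand the whole string to the general Unicode-aware routine.
    foreach (char c; s)
        if (c >= 0x80)
            return s.toLower;

    // Pass 2: pure ASCII. Copy into a fresh buffer, mapping 'A'..'Z'
    // to 'a'..'z' and leaving every other byte unchanged.
    auto result = new char[](s.length);
    foreach (i, char c; s)
        result[i] = (c >= 'A' && c <= 'Z') ? cast(char)(c + ('a' - 'A')) : c;

    // The buffer was just allocated and has no other references,
    // so viewing it as an immutable string is safe here.
    return cast(string) result;
}
```

For example, toLowerAsciiFastPath("Pride AND Prejudice") takes the byte loop,
while a string containing an accented letter falls through to
std.string.toLower. The extra scan costs one more pass over the data; whether
that is a net win on real text would have to be measured.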


Note that this program:

void main() {
    import std.ascii: toLower;
    import std.file: read;
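    // Read the whole file and view its bytes as mutable chars.
    auto txt = cast(char[])"pg1342.txt".read;
    // Lowercase in place, one ASCII code unit at a time.
    foreach (ref c; txt)
        c = cast(char)c.toLower;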
}


Compiled the same way, it runs in about 0.03 seconds.

Oct 11 2013