
digitalmars.D - Regex benchmarks in Rust, Scala, D and F#

reply Karthikeyan <tir.karthi gmail.com> writes:
Hi,

Came across this post in the rust-lang subreddit about regex
benchmarks. Scala surprisingly outperforms D. LDC also gives D a
good efficiency advantage.

http://vaskir.blogspot.ru/2015/09/regular-expressions-rust-vs-f.html
Jan 05 2016
next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Tuesday, 5 January 2016 at 17:52:39 UTC, Karthikeyan wrote:
 Hi,

 Came across this post in the rust-lang subreddit about regex
 benchmarks. Scala surprisingly outperforms D. LDC also gives D a
 good efficiency advantage.

 http://vaskir.blogspot.ru/2015/09/regular-expressions-rust-vs-f.html
I'm willing to bet D's bad result comes from the use of DMD. Honestly, pushing DMD as the reference implementation has cost us quite a lot on the PR side of things. D appears to be slower than it really is.
Jan 05 2016
next sibling parent reply Martin Drašar via Digitalmars-d writes:
On 5.1.2016 at 19:09, deadalnix via Digitalmars-d wrote:
 On Tuesday, 5 January 2016 at 17:52:39 UTC, Karthikeyan wrote:
 Hi,

 Came across this post in the rust-lang subreddit about regex
 benchmarks. Scala surprisingly outperforms D. LDC also gives D a
 good efficiency advantage.

 http://vaskir.blogspot.ru/2015/09/regular-expressions-rust-vs-f.html
I'm willing to bet D's bad result comes from the use of DMD. Honestly, pushing DMD as the reference implementation has cost us quite a lot on the PR side of things. D appears to be slower than it really is.
To be fair, they have results for both DMD and LDC:

regex - 10.6 s (DMD), 7.8 s (LDC)
ctRegex! - 6.9 s (DMD), 6.6 s (LDC)

There is no information about compiler switches, though.
Jan 05 2016
next sibling parent Karthikeyan <tir.karthi gmail.com> writes:
On Tuesday, 5 January 2016 at 18:13:00 UTC, Martin Drašar wrote:
 On 5.1.2016 at 19:09, deadalnix via Digitalmars-d wrote:
 On Tuesday, 5 January 2016 at 17:52:39 UTC, Karthikeyan wrote:
 Hi,

 Came across this post in the rust-lang subreddit about regex
 benchmarks. Scala surprisingly outperforms D. LDC also gives D a
 good efficiency advantage.

 http://vaskir.blogspot.ru/2015/09/regular-expressions-rust-vs-f.html
I'm willing to bet D's bad result comes from the use of DMD. Honestly, pushing DMD as the reference implementation has cost us quite a lot on the PR side of things. D appears to be slower than it really is.
To be fair, they have results for both DMD and LDC:

regex - 10.6 s (DMD), 7.8 s (LDC)
ctRegex! - 6.9 s (DMD), 6.6 s (LDC)

There is no information about compiler switches, though.
Relevant reddit discussion: https://www.reddit.com/r/rust/comments/3zh95h/regular_expressions_rust_vs_f_vs_scala_vs_d/
Jan 05 2016
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 01/05/2016 01:13 PM, Martin Drašar via Digitalmars-d wrote:
 On 5.1.2016 at 19:09, deadalnix via Digitalmars-d wrote:
 On Tuesday, 5 January 2016 at 17:52:39 UTC, Karthikeyan wrote:
 Hi,

 Came across this post in the rust-lang subreddit about regex
 benchmarks. Scala surprisingly outperforms D. LDC also gives D a
 good efficiency advantage.

 http://vaskir.blogspot.ru/2015/09/regular-expressions-rust-vs-f.html
I'm willing to bet D's bad result comes from the use of DMD. Honestly, pushing DMD as the reference implementation has cost us quite a lot on the PR side of things. D appears to be slower than it really is.
To be fair, they have results for both DMD and LDC:

regex - 10.6 s (DMD), 7.8 s (LDC)
ctRegex! - 6.9 s (DMD), 6.6 s (LDC)

There is no information about compiler switches, though.
The benchmark measures a mixture of times, not only regex time. Could somebody tweak the benchmark and figure out what the individual timings are (I/O vs. regex vs. the pipeline vs. appending)? -- Andrei
Jan 05 2016
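One way to get the kind of breakdown Andrei asks for is to give each phase its own stopwatch. The following is a minimal D sketch, not the benchmark's actual code: the names `PhaseTimes` and `timePhases` and the pattern passed in are assumptions made for illustration, and it separates only line reading from matching (the pipeline and appending phases are not modelled).

```d
import core.time : Duration;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.regex : matchFirst, regex;
import std.stdio : File;

/// Accumulated time per phase plus the number of matching lines.
struct PhaseTimes
{
    Duration io;       // time spent inside readln
    Duration matching; // time spent inside matchFirst
    size_t hits;       // lines with at least one match
}

/// Reads `path` line by line and times I/O and regex matching separately.
/// `pattern` is a hypothetical stand-in for the benchmark's regex.
PhaseTimes timePhases(string path, string pattern)
{
    auto re = regex(pattern);
    auto ioTimer = StopWatch(AutoStart.no);
    auto reTimer = StopWatch(AutoStart.no);
    PhaseTimes result;

    auto file = File(path);
    char[] buf; // reused across iterations
    for (;;)
    {
        ioTimer.start();
        immutable n = file.readln(buf);
        ioTimer.stop();
        if (n == 0)
            break; // end of file

        reTimer.start();
        if (!matchFirst(buf, re).empty)
            ++result.hits;
        reTimer.stop();
    }

    result.io = ioTimer.peek();
    result.matching = reTimer.peek();
    return result;
}
```

Calling `timePhases("input.txt", "[0-9]+")` from `main` and printing `result.io.total!"msecs"` next to `result.matching.total!"msecs"` would show which phase dominates. Starting and stopping a stopwatch on every line adds overhead of its own, so the split is indicative rather than exact.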
prev sibling parent Daniel Kozak <kozzi11 gmail.com> writes:
On Tuesday, 5 January 2016 at 18:09:54 UTC, deadalnix wrote:
 On Tuesday, 5 January 2016 at 17:52:39 UTC, Karthikeyan wrote:
 Hi,

 Came across this post in the rust-lang subreddit about regex
 benchmarks. Scala surprisingly outperforms D. LDC also gives D a
 good efficiency advantage.

 http://vaskir.blogspot.ru/2015/09/regular-expressions-rust-vs-f.html
I'm willing to bet D's bad result comes from the use of DMD. Honestly, pushing DMD as the reference implementation has cost us quite a lot on the PR side of things. D appears to be slower than it really is.
Not at all:

D (LDC): 5.778 s
D (GDC): 5.612 s
D (DMD): 5.267 s
scalac: 5.748 s
rustc: 9.287 s

So DMD is the fastest.
Jan 06 2016
prev sibling next sibling parent reply Gerald <me me.com> writes:
On Tuesday, 5 January 2016 at 17:52:39 UTC, Karthikeyan wrote:
 Hi,

 Came across this post in the rust-lang subreddit about regex
 benchmarks. Scala surprisingly outperforms D. LDC also gives D a
 good efficiency advantage.

 http://vaskir.blogspot.ru/2015/09/regular-expressions-rust-vs-f.html
I notice he's using readln() instead of readln(buf) in the D solution, would having D re-use the buffer make a substantial improvement in performance?
Jan 05 2016
parent reply Basile B. <b2.temp gmx.com> writes:
On Tuesday, 5 January 2016 at 18:19:23 UTC, Gerald wrote:
 I notice he's using readln() instead of readln(buf) in the D 
 solution, would having D re-use the buffer make a substantial 
 improvement in performance?
Yep. It's a life-changer. There's a before and an after.
Jan 05 2016
parent Gerald <me me.com> writes:
On Tuesday, 5 January 2016 at 18:21:39 UTC, Basile B. wrote:
 On Tuesday, 5 January 2016 at 18:19:23 UTC, Gerald wrote:
 I notice he's using readln() instead of readln(buf) in the D 
 solution, would having D re-use the buffer make a substantial 
 improvement in performance?
Yep. It's a life-changer. There's a before and an after.
Tried it on my laptop. It only shaved half a second off the total time; I was hoping for more.
Jan 05 2016
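For readers unfamiliar with the two calls discussed above, here is a small D sketch of the difference. `File.readln()` with no arguments returns a freshly allocated string for every line, while `File.readln(buf)` refills a caller-supplied `char[]`. The function names `countBytesFresh` and `countBytesReused` are hypothetical and exist only for this comparison.

```d
import std.stdio : File;

/// Allocates a new string for every line (readln with no arguments).
size_t countBytesFresh(string path)
{
    auto f = File(path);
    size_t total;
    string line;
    // A returned line includes its terminator, so an empty result means EOF.
    while ((line = f.readln()).length != 0)
        total += line.length;
    return total;
}

/// Reuses one buffer for every line (readln(buf)).
size_t countBytesReused(string path)
{
    auto f = File(path);
    size_t total;
    char[] buf; // grown as needed, then reused
    // readln(buf) returns the number of characters read, 0 at EOF.
    while (f.readln(buf) != 0)
        total += buf.length;
    return total;
}
```

Both functions return the same count. The second avoids a per-line allocation, but `buf` is overwritten on each call, so any line that must outlive the loop iteration has to be copied (for example with `buf.idup`).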
prev sibling next sibling parent reply rsw0x <anonymous anonymous.com> writes:
On Tuesday, 5 January 2016 at 17:52:39 UTC, Karthikeyan wrote:
 Hi,

 Came across this post in the rust-lang subreddit about regex
 benchmarks. Scala surprisingly outperforms D. LDC also gives D a
 good efficiency advantage.

 http://vaskir.blogspot.ru/2015/09/regular-expressions-rust-vs-f.html
k, I optimized it for fun: https://paste.ee/p/Bb1Ns

I also found an mmfile bug, if someone could report it for me: if I don't add a root to the mmfile contents (or disable the GC), the program will crash.

ctRegex for both:
his: Elapsed: 6432
mine: Elapsed: 3123

Compiled with: ldc -O3 -release -boundscheck=off -singleobj regex.d

It does not compile with the latest gdc on Arch Linux (no lineSplitter).

You could probably get it faster by reusing a buffer when reading the file (I forget which call this is, but it exists in Phobos), but I felt like using mmfile as I've never used it before. Make sure you use a block-size buffer (e.g., 4096).

Bye.
Jan 05 2016
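The pasted code is not reproduced in the thread, so the following is only a rough D sketch of the approach rsw0x describes (memory-mapped file, `lineSplitter`, `ctRegex`), not his actual program. The function name `countMatchingLines` and the pattern are assumptions. The sketch keeps the `MmFile` object referenced for as long as the mapped slice is in use; whether that sidesteps the crash he mentions is not something the thread confirms.

```d
import std.mmfile : MmFile;
import std.regex : ctRegex, matchFirst;
import std.string : lineSplitter;

/// Counts the lines of `path` that contain at least one match.
size_t countMatchingLines(string path)
{
    // Hypothetical pattern, compiled at compile time; the benchmark's
    // real pattern is not shown in the thread.
    auto re = ctRegex!(`[0-9]+`);

    // Map the file read-only. The mapping is only valid while `mmf` lives,
    // so it is destroyed explicitly once the function is done with the slice.
    auto mmf = new MmFile(path);
    scope (exit) destroy(mmf);

    // mmf[] yields void[]; view it as text without copying.
    auto text = cast(const(char)[]) mmf[];

    size_t hits;
    foreach (line; text.lineSplitter)
    {
        if (!matchFirst(line, re).empty)
            ++hits;
    }
    return hits;
}
```

The design trade-off is the one raised in the post: mapping avoids explicit read buffers entirely, whereas a reused buffer keeps memory use bounded. Mapping an empty file may fail, which this sketch does not handle.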
parent reply Messenger <dont shoot.me> writes:
On Tuesday, 5 January 2016 at 20:04:35 UTC, rsw0x wrote:
[...]
Anyone on linux who could imgur a callgraph please? Premature optimisation and all that.
Jan 05 2016
parent rsw0x <anonymous anonymous.com> writes:
On Tuesday, 5 January 2016 at 23:24:24 UTC, Messenger wrote:
 On Tuesday, 5 January 2016 at 20:04:35 UTC, rsw0x wrote:
[...]
Anyone on linux who could imgur a callgraph please? Premature optimisation and all that.
Nearly all the time is spent inside the regex itself and in filtering out empty results; moving the file to /tmp (ramdisk) does nothing for performance.

With the way the problem is structured, I'm not sure how it could be changed beyond some micro-optimizations, as you can't use matchAll afaict because the benchmark requires it to be line-by-line.

Well, you probably could; I just don't feel like coding it because I hate regex.
Jan 05 2016
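As an illustration of the `matchAll` alternative mentioned above, here is a minimal D sketch that scans the whole text in one pass instead of line by line. The function name `countAllMatches` and the example pattern are assumptions. Note that it counts every match rather than matching lines, so it only gives the same answer as the line-by-line version when the pattern cannot span a newline and each line contains at most one match.

```d
import std.regex : matchAll, regex;

/// Counts every non-overlapping match of `pattern` in `text`.
size_t countAllMatches(const(char)[] text, string pattern)
{
    auto re = regex(pattern);
    size_t n;
    // matchAll lazily yields one Captures object per match.
    foreach (m; matchAll(text, re))
        ++n;
    return n;
}
```

The input could come from `std.file.readText` or from a memory-mapped slice. Whether this is faster than per-line matching for the benchmark's actual pattern would need to be measured.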
prev sibling parent reply israel <tl12000 live.com> writes:
On Tuesday, 5 January 2016 at 17:52:39 UTC, Karthikeyan wrote:
 Hi,

 Came across this post in the rust-lang subreddit about regex
 benchmarks. Scala surprisingly outperforms D. LDC also gives D a
 good efficiency advantage.

 http://vaskir.blogspot.ru/2015/09/regular-expressions-rust-vs-f.html
I think the problem with these "benchmarks" is that when their favorite language is up there and not doing as well as the others, people begin to yell that they didn't optimize the code well, either through compiler flags or something else. There should be a public benchmark standard. No special functions. No special linker flags. Just the plain code and compilation process.
Jan 05 2016
parent wobbles <grogan.colin gmail.com> writes:
On Wednesday, 6 January 2016 at 07:05:43 UTC, israel wrote:
 On Tuesday, 5 January 2016 at 17:52:39 UTC, Karthikeyan wrote:
 Hi,

 Came across this post in the rust-lang subreddit about regex
 benchmarks. Scala surprisingly outperforms D. LDC also gives D a
 good efficiency advantage.

 http://vaskir.blogspot.ru/2015/09/regular-expressions-rust-vs-f.html
I think the problem with these "benchmarks" is that when their favorite language is up there and not doing as well as the others, people begin to yell that they didn't optimize the code well, either through compiler flags or something else. There should be a public benchmark standard. No special functions. No special linker flags. Just the plain code and compilation process.
That'll never work though. 'Just the plain code' to me isn't 'just the plain code' to you. Ideally, there would be a Git repo somewhere with a lot of benchmarks that the community can edit and improve. Over time (assuming the repo becomes somewhat popular), all the benchmark programs would use each language to its fullest, thus giving accurate, comparable results across the board.
Jan 06 2016