
digitalmars.D - Parallel Rogue-like benchmark

reply "bearophile" <bearophileHUGS lycos.com> writes:
The author of the serial language comparison has now created a 
simple parallel comparison:

http://togototo.wordpress.com/2013/08/23/benchmarks-round-two-parallel-go-rust-d-scala-and-nimrod/

 From the blog post:

but for the D and Rust implementations only the single-threaded 
portion was written idiomatically; the parallelisation of that 
code was done naively by myself.<

The single-threaded D version of the code is not idiomatic, it's very C-like code. This was the idiomatic serial D version:

https://github.com/logicchains/levgen-benchmarks/blob/master/D-XorShift.d

Also the author keeps changing the way he measures languages, and in the meantime he has added the line count metric too, so now the D version should be modified so code like this takes 2 lines instead of 12:

struct Tile {
    int X = void;
    int Y = void;
    int T = void;
}

struct Room {
    int X = void;
    int Y = void;
    int W = void;
    int H = void;
    int N = void;
}

More from the blog post:
The D implementation also surprised me, although I think this 
was largely because I was unfamiliar with D’s memory model. I 
originally tried having each thread write to part of a global 
array, like in the C implementation, but, after they had 
completed their task and the time came to read the data, the 
array was empty, possibly having been garbage collected. The 
solution was to mark the array as shared, which required 
propagating the chain of ‘shared’ typing to the objects being 
written to the array in order to ensure safety. This was an 
interesting difference from Rust’s memory safety model, where 
the type of pointers rather than objects determines how they can 
be shared, but I’m not familiar enough with D’s memory model to 
comment on their relative effectiveness.<
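
A minimal sketch of the thread-local vs. shared behaviour described in that quote (hypothetical illustration, not the benchmark's code; the array contents and worker logic are made up):

```d
import core.atomic : atomicLoad, atomicStore;
import core.thread : Thread;

// Module-level data is thread-local by default in D; without `shared`
// each worker would fill its own per-thread copy and main() would see
// an untouched array -- the "empty array" the author ran into.
shared int[4] results;

void delegate() makeWorker(int idx)
{
    // A factory function gives each delegate its own captured idx.
    return { atomicStore(results[idx], idx + 1); };
}

void main()
{
    Thread[] threads;
    foreach (i; 0 .. 4)
        threads ~= new Thread(makeWorker(i)).start();
    foreach (t; threads)
        t.join();
    foreach (i; 0 .. 4)
        assert(atomicLoad(results[i]) == i + 1);
}
```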

I parallelised the D and Rust programs myself, hence the code is 
probably not idiomatic, and still has plenty of room for 
improvement. D for instance has a parallel for loop; I couldn’t 
get it working, but if someone got it working then it would 
significantly reduce the size of the code.<
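
The parallel for loop the author couldn't get working is std.parallelism's parallel foreach; a minimal sketch (illustrative only -- the per-slot computation here is a placeholder, not the level-generation code):

```d
import std.parallelism : parallel;
import std.stdio : writeln;

void main()
{
    // Each iteration writes only its own slot, so no locking and no
    // `shared` qualifier is needed; taskPool workers split the range.
    auto results = new int[](4);
    foreach (i, ref r; parallel(results))
        r = cast(int) i * 10;  // stand-in for per-slot level generation
    writeln(results);  // prints [0, 10, 20, 30]
}
```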

Bye, bearophile
Aug 23 2013
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
The missing link to the Reddit thread:

http://www.reddit.com/r/programming/comments/1kxt7w/parallel_roguelike_levgen_benchmarks_rust_go_d/

Bye,
bearophile
Aug 23 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/23/13 9:29 AM, bearophile wrote:
 The missing link to the Reddit thread:

 http://www.reddit.com/r/programming/comments/1kxt7w/parallel_roguelike_levgen_benchmarks_rust_go_d/

Awesome, upboat!

I am mostly speculating, but in the past few months I subjectively perceive we've turned a corner of sorts in terms of outlook and adoption by the larger community. Complaints about quality have dwindled and more people report trying D and finding it impressive. Seems we're doing something right.

Congratulations on the hard work to everyone involved!

Andrei
Aug 23 2013
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Aug 23, 2013 at 03:48:38PM +0200, bearophile wrote:
 The author of the serial language comparison has now created a
 simple parallel comparison:
 
 http://togototo.wordpress.com/2013/08/23/benchmarks-round-two-parallel-go-rust-d-scala-and-nimrod/

 Also the author keeps changing the way he measures languages, and in
 the meantime he has added the line count metric too, so now the D
 version should be modified so code like this takes 2 lines instead
 of 12:
 
 struct Tile {
     int X = void;
     int Y = void;
     int T = void;
 }
 
 struct Room {
     int X = void;
     int Y = void;
     int W = void;
     int H = void;
     int N = void;
 }

Seriously, I don't understand what's with this obsession with line count metrics. Here's a 2-line version of the above code:

struct Tile { int X = void; int Y = void; int T = void; }
struct Room { int X = void; int Y = void; int W = void; int H = void; int N = void; }

which, as you can tell, says *nothing* whatsoever about the expressiveness of the language. I mean, how would you account for the possibility that perhaps the D community adopted this 2-line style of coding as the "preferred" D style? Since we adopted a style that uses up more lines, why should the *language* be penalized for having a higher line count, when the 2-line version is equally acceptable to the compiler and has exactly the same semantics?

Or, put another way, suppose I rename "void" to "nada", and then call the resulting language E, and I postulate that in E the preferred coding style is my 2-line format above. Does this then mean that E is more expressive than D? Certainly not! But according to the line count metric, it does.

Frankly, the fact that line counts are used at all has already decremented the author's credibility for me.

T

-- 
English is useful because it is a mess. Since English is a mess, it maps well onto the problem space, which is also a mess, which we call reality. Similarly, Perl was designed to be a mess, though in the nicest of all possible ways. -- Larry Wall
Aug 23 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/23/2013 7:10 PM, Jesse Phillips wrote:
 If we decided that 2 lines was how we do formatting,

In general, I regard a "line of code" as one statement or one declaration. Comments don't count, nor does cramming 3 statements into one line make it one LOC. Of course, you can still game that, but if you're reasonable then it is a reasonable measure.
Aug 23 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/23/2013 9:58 PM, H. S. Teoh wrote:
 On Fri, Aug 23, 2013 at 08:25:20PM -0700, Walter Bright wrote:
 On 8/23/2013 7:10 PM, Jesse Phillips wrote:
 If we decided that 2 lines was how we do formatting,

In general, I regard a "line of code" as one statement or one declaration. Comments don't count, nor does cramming 3 statements into one line make it one LOC. Of course, you can still game that, but if you're reasonable then it is a reasonable measure.

I am still skeptical of LOC as a reliable measure of language complexity. How many LOC does the following code have?

Like I said, you can still game it. I think some common sense applies, not a literal interpretation.
Aug 23 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/23/2013 10:23 PM, H. S. Teoh wrote:
 Like I said, you can still game it. I think some common sense
 applies, not a literal interpretation.

 You conveniently snipped the rest of my post, which postulates a far
 better metric that's no harder to apply in practice. :)

You can't compress by visually looking at the code, and LOC is a unit that people fairly intuitively understand.
 All it takes is for you to run zip instead of wc -l, and you have a far
 better metric for measuring language expressiveness. Insisting on using
 LOC despite this just makes one lose one's credibility.

LOC ain't that far off, if you use it with some common sense rather than literal dogma.
Aug 24 2013
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 8/24/2013 9:17 AM, H. S. Teoh wrote:
 What's so difficult about running zip on the code?

It's not so easy to run zip on a snippet in a magazine article, as opposed to visually just looking at it.
 The fault of LOC is precisely that people "fairly intuitively"
 understand it. The problem is that no two people's intuitions ever
 match.  So any conclusions drawn from LOC must necessarily be
 subjective,

Language comparisons are *always* subjective. A corollary to that is one can always game any supposedly objective criteria.
Aug 24 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/25/13 3:57 PM, Joseph Rushton Wakeling wrote:
 On 24/08/13 19:01, Ramon wrote:
 I think that there is a lot speaking against sloc.

 First it's often (ab?)used for "Ha! My language x is better than
 yours. I can
 write a web server in 3 lines, you need 30".

Don't know about a web server, but I remember somewhere online I found this really cool 3-line quicksort that you can do in Haskell :-)

qsort [] = []
qsort (x:xs) = qsort (filter (< x) xs)
            ++ [x] ++ qsort (filter (>= x) xs)

This is one of the worst PR functional programming has ever gotten, and one of the worst things FP has done to the larger community. Somebody should do hard time for this. And yes, for that matter it's a great example in which SLOCs are not a very good measure. Andrei
Aug 25 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/25/13 6:16 PM, Paul Jurczak wrote:
 On Sunday, 25 August 2013 at 23:16:17 UTC, Joseph Rushton Wakeling wrote:
 On 26/08/13 01:06, Andrei Alexandrescu wrote:
 This is one of the worst PR functional programming has ever gotten,
 and one of
 the worst things FP has done to the larger community. Somebody should
 do hard
 time for this. And yes, for that matter it's a great example in which
 SLOCs are
 not a very good measure.

I thought you'd like me bringing this up. ;-)

You still have a chance, because I don't quite get it. With the little I know about Haskell, I find this code very elegant. What is wrong with it? Performance?

I ranted about that a couple of times (including during my DConf talk). In brief:

1. quicksort is based on in-place partition, so at best that example is a different (and less adequate) algorithm inspired by quicksort.

2. choosing the first element as pivot yields quadratic performance on an almost sorted sequence, which is a common practical case.

That example is part of a toxic trifecta (also including the doubly-exponential fibonacci and the O(N) space factorial) that has done a lot of bad teaching.

Andrei
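
For contrast, an in-place partition quicksort along the lines Andrei describes can be sketched in D (a minimal Lomuto-partition sketch, not the Phobos implementation):

```d
import std.algorithm.mutation : swap;

// In-place quicksort with Lomuto partition. The last element is used as
// pivot only for brevity; a serious version would pick e.g. median-of-3,
// exactly to dodge the quadratic almost-sorted case mentioned above.
void quicksort(int[] a)
{
    if (a.length < 2)
        return;
    immutable pivot = a[$ - 1];
    size_t store = 0;
    foreach (i; 0 .. a.length - 1)
        if (a[i] < pivot)
            swap(a[i], a[store++]);
    swap(a[$ - 1], a[store]);
    quicksort(a[0 .. store]);       // no copies: both calls recurse on slices
    quicksort(a[store + 1 .. $]);
}

void main()
{
    auto a = [3, 1, 4, 1, 5, 9, 2, 6];
    quicksort(a);
    assert(a == [1, 1, 2, 3, 4, 5, 6, 9]);
}
```

Unlike the Haskell one-liner, this allocates nothing: the two recursive calls work on slices of the same array.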
Aug 25 2013
prev sibling next sibling parent Justin Whear <justin economicmodeling.com> writes:
On Fri, 23 Aug 2013 09:48:54 -0700, H. S. Teoh wrote:

 Seriously, I don't understand what's with this obsession with line count
 metrics. Here's a 2-line version of the above code:
 
 struct Tile { int X = void; int Y = void; int T = void; }
 struct Room { int X = void; int Y = void; int W = void; int H = void;
 int N = void; }
 
 Frankly, the fact that line counts are used at all has already
 decremented the author's credibility for me.
 
 
 T

Let's get that character count down :)

struct Tile { int X=void, Y=void, T=void; }
struct Room { int X=void, Y=void, W=void, H=void, N=void; }
Aug 23 2013
prev sibling next sibling parent "Ramon" <spam thanks.no> writes:
I like it and see an interesting mix of concepts in Nimrod.

That said, I didn't and still don't see the major breakthrough or
value of {} vs. begin/end vs. Python style. While I agree that
Python enforces some visual style I also see that this question
always comes down to personal philosophy and discipline. You can
write clean looking code in D as well as in Python. If someone
really really feels, say, begin/end to be earthshakingly,
strategically better than {}, he is free to edit the language
spec and to create his private D (or whatever) with begin/end
rather than {}.

More importantly, I feel that whole sloc thingy misses the major
point. Code is written for 2 situations, once for the compiler
(which doesn't care too much about human readability) and, imo
way more important, maintenance. *That*, maintenance, is what
readability is or should be mostly all about.

I'll keep an occasional eye on Nimrod but I fail to spot anything
major or strategic enough to warrant even considering leaving D
for Nimrod.

R
Aug 23 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
H. S. Teoh:

 Seriously, I don't understand what's with this obsession with 
 line count metrics.
 ...
 Frankly, the fact that line counts are used at all has already
 decremented the author's credibility for me.

I agree with you. When you show a table to people, and such a table is ranked according to run-time and cloc counts, some readers will judge the better languages to be the ones that have lower run-time and lower line count.

In my D code I like to add unittests, contracts, asserts for loop invariants, and so on. All such things increase the line count. A person could say that in theory a better language should offer safety even without all those extra tests, so those extra lines of code should be counted in the length of the D entry, and removing them to improve the D line count metric is not fair.

But in most languages (including D, C, Python, etc.) you can write code that has various degrees of safety. This means that a D entry without those compile-time and run-time tests is just as valid as D code with those tests. This means using just the line count metric is not enough: you somehow have to produce entries that show a similar level of safety (and doing this across different languages is not easy); only then is the line count metric meaningful.

In the meantime, on Rosetta Code and other places where D code is shown and compared to code written in other languages, I put in all the run-time and compile-time tests I feel necessary, because for a professional programmer that's the right thing to do, from an engineering point of view. That said, in such comparisons wasting space is not a good idea.

Bye,
bearophile
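
The kind of extra checking bearophile means can be sketched like this: contracts and a unittest block add lines (and safety) that a bare implementation would not have. `clampIndex` is a hypothetical helper, not from the benchmark code.

```d
// Contracts document and enforce the function's interface; the unittest
// block adds regression checks. Both add lines a stripped-down entry
// would not have, yet both are idiomatic D.
int clampIndex(int i, int len)
in (len > 0)                   // precondition: a valid length
out (r; r >= 0 && r < len)     // postcondition: result is a valid index
{
    if (i < 0) return 0;
    if (i >= len) return len - 1;
    return i;
}

unittest
{
    assert(clampIndex(-3, 10) == 0);
    assert(clampIndex(42, 10) == 9);
    assert(clampIndex(5, 10) == 5);
}

void main()
{
    assert(clampIndex(7, 10) == 7);
}
```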
Aug 23 2013
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/23/2013 6:48 AM, bearophile wrote:
 The author of the serial language comparison has now created a simple parallel
 comparison:

 http://togototo.wordpress.com/2013/08/23/benchmarks-round-two-parallel-go-rust-d-scala-and-nimrod/

And note how well D did in the speed tests!
Aug 23 2013
next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
07-Nov-2013 18:27, Daniel Davidson writes:
 On Thursday, 7 November 2013 at 13:12:56 UTC, bearophile wrote:
 Marco Leise:

 I made it idiomatic, D is on place 1 now by a big margin. See the
 'ldc2' entry:
 http://togototo.wordpress.com/2013/08/23/benchmarks-round-two-parallel-go-rust-d-scala-and-nimrod/

Very nice. I have made a more idiomatic version (in D global constants don't need to be IN_UPPERCASE), I have added a few missing immutable annotations, and, given the benchmark also counts line numbers, I have made the code a little more compact (particularly the struct definitions). But I have moved the then-branch of some ifs onto a new line, increasing the line count, to make the code a little more readable, so it's not unnaturally compact D code:

http://dpaste.dzfl.pl/d37ba995

Bye,
bearophile

Regarding what is idiomatic D, isn't `immutable x = rnd.next % levelSize;` pedantic? Why not just go with `const x = rnd.next % levelSize;`?

IMHO yes, it's pedantic.
 Any time the type is a fundamental type with no aliasing there is no
 sharing so the differentiation of immutable vs const is unnecessary.
 Might as well go with const. immutable says I or no-one else will change
 the value. But since no-one else can change the value it seems overkill,
 no?

immutable is useful for non-value types. Take strings: whatever function/module a string is passed to later on can safely avoid copying it, since it's immutable.

-- 
Dmitry Olshansky
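
A minimal sketch of where the two qualifiers actually diverge (illustrative code, not from the benchmark):

```d
void main()
{
    // For plain value types, const and immutable are interchangeable
    // in practice, which is the point made above:
    const cx = 42;
    immutable ix = 42;
    assert(cx == ix);

    // With indirection they diverge: const is a read-only *view* that
    // may still observe mutation through another reference...
    char[] buf = "hello".dup;
    const(char)[] view = buf;
    buf[0] = 'H';
    assert(view == "Hello");   // the const view saw the change

    // ...while immutable guarantees no mutable alias exists anywhere,
    // which is why strings (immutable(char)[]) can be shared or kept
    // without defensive copies.
    string s = "hello";
    assert(s == "hello");
}
```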
Nov 07 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 07/11/13 12:47, Marco Leise wrote:
 I made it idiomatic, D is on place 1 now by a big margin. See the
 'ldc2' entry:

Slightly baffled to see "ldc2" and "ldmd2" listed as two separate entries. Your code will surely achieve similar speed when compiled with ldmd2 and appropriate optimization choices .... ?
Nov 07 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 08/11/13 04:13, logicchains wrote:
 Benchmark author here. I left the ldmd2 entry there to represent the
performance
 of the D implementation from the time of the benchmark, to highlight that the
 current D implementation is much newer than the others, and that there have
been
 no attempts to optimise the C and C++ versions similarly to how the latest D
 version was optimised. If you feel it creates needless confusion I can remove
 it, however, or put a note next to it stating the above.

Seems fine to me to compare two different code implementations, but displaying things as you have suggests this is a compiler difference. For proper comparison, you should probably compile both codes with ldc2 and the same optimizations, and see how they compare.
Nov 09 2013
prev sibling next sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 07/11/13 14:12, bearophile wrote:
 Very nice. I have made a more idiomatic version (in D global constants don't
 need to be IN_UPPERCASE), I have added few missing immutable annotations, and
 given the benchmark also counts line numbers, I have made the code a little
more
 compact (particularly the struct definitions, but I have moves the then branch
 of some if on a new line, increasing the line count to make the code a little
 more readable, so it's not a unnaturally compact D code):

 http://dpaste.dzfl.pl/d37ba995

How does the speed of that code change if instead of the Random struct, you use std.random.Xorshift32 ... ?
Nov 09 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
09-Nov-2013 16:23, bearophile writes:
 Joseph Rushton Wakeling:

 How does the speed of that code change if instead of the Random
 struct, you use std.random.Xorshift32 ... ?

That change of yours was well studied in the first blog post (the serial one) and the performance loss of using Xorshift32 was significant, even with LDC2. I don't know why.

Lack of inlining, most likely:

https://d.puremagic.com/issues/show_bug.cgi?id=10985

Since Xorshift32 is fully specified there is nothing to instantiate in the calling code, hence it may just be linked from Phobos. Anyhow, studying the disassembly of the binary will get the answer.

-- 
Dmitry Olshansky
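
The hand-rolled generator in the benchmark is (roughly) a 32-bit xorshift; a sketch of such a struct, with the caveat that the 13/17/5 shift triple is Marsaglia's standard choice and may differ from the benchmark's exact constants. Being fully visible at the call site, `next` is trivially inlinable, which is the advantage Dmitry speculates about:

```d
struct Rng
{
    uint state;  // must be seeded nonzero

    uint next()
    {
        // Marsaglia xorshift: three shift-xor steps, period 2^32 - 1.
        state ^= state << 13;
        state ^= state >> 17;
        state ^= state << 5;
        return state;
    }
}

void main()
{
    auto rng = Rng(2_463_534_242u);
    uint x;
    foreach (_; 0 .. 1000)
        x = rng.next();
    // From a nonzero seed, xorshift never reaches the all-zero state.
    assert(x != 0);
}
```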
Nov 09 2013
prev sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 10/11/13 05:31, logicchains wrote:
 The former produces better random numbers, but it's possible that it may be
slower.

Ahh, makes sense. Where did you get the particular RNG you used? I don't recognize it.
Nov 10 2013
prev sibling next sibling parent "Geancarlo Rocha" <nope mailinator.com> writes:
Not really intrinsic to the language (syntactically), but there is
the "soft realtime GC", meaning you can control when and for how
long the GC can do the collecting. Sounds like a lovely feature
for games.

http://nimrod-code.org/gc.html

On Friday, 23 August 2013 at 17:33:12 UTC, Ramon wrote:
 I like it and see an interesting mix of concepts in Nimrod.

 That said, I didn't and still don't see the major breakthrough 
 or
 value of {} vs. begin/end vs. Python style. While I agree that
 Python enforces some visual style I also see that this question
 always comes down to personal philosophy and discipline. You can
 write clean looking code in D as well as in Python. If someone
 really really feels, say, begin/end to be earthshakingly,
 strategically better than {}, he is free to edit the language
 spec and to create his private D (or whatever) with begin/end
 rather than {}.

 More importantly, I feel that whole sloc thingy misses the major
 point. Code is written for 2 situations, once for the compiler
 (which doesn't care too much about human readability) and, imo
 way more important, maintenance. *That*, maintenance, is what
 readability is or should be mostly all about.

 I'll keep an occasional eye on Nimrod but I fail to spot 
 anything
 major or strategic enough to warrant even considering leaving D
 for Nimrod.

 R

Aug 23 2013
prev sibling next sibling parent "Rusty Shackleford" <negative gmail.com> writes:
On Friday, 23 August 2013 at 13:48:39 UTC, bearophile wrote:
 The author of the serial language comparison has now created a 
 simple parallel comparison:

...

Off-topic: First time hearing of Nimrod, it has a neat GC implementation for games and similar soft-realtime applications. Being able to step your GC by small amounts each frame instead of just getting freezes every few hundred-ish would be handy.
Aug 23 2013
prev sibling next sibling parent "Paul Jurczak" <pauljurczak yahoo.com> writes:
On Friday, 23 August 2013 at 16:50:27 UTC, H. S. Teoh wrote:
[..]
 Frankly, the fact that line counts are used at all has already
 decremented the author's credibility for me.

I agree that LOC is a very poor measure, but I think the intent was to offer some sort of comparison of syntactic complexity or code bloat (think Haskell vs. Java). Counting lines or non-whitespace characters is low-hanging fruit; anything more sophisticated seems to be quite difficult. More importantly, LOC is also used in relative mode, to show the fraction of code that had to be modified for parallelism.
Aug 23 2013
prev sibling next sibling parent "Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:
On Friday, 23 August 2013 at 16:50:27 UTC, H. S. Teoh wrote:
 Seriously, I don't understand what's with this obsession with 
 line count metrics.

While LOC isn't a very good metric, you're complaining about things that aren't really there. Yes, you can shorten it to 2 lines, but did he? It looks like he had a consistent formatting style across the languages.

If we decided that 2 lines was how we do formatting, what is wrong with that? If it is harder to read (and I believe it is) then our code examples aren't going to attract people, even though the line count is down.

All that said, I think the delta is a nice statement.
Aug 23 2013
prev sibling next sibling parent "Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:
On Friday, 23 August 2013 at 19:17:01 UTC, Walter Bright wrote:
 On 8/23/2013 6:48 AM, bearophile wrote:
 The author of the serial language comparison has now created a 
 simple parallel
 comparison:

 http://togototo.wordpress.com/2013/08/23/benchmarks-round-two-parallel-go-rust-d-scala-and-nimrod/

And note how well D did in the speed tests!

It gets second no matter how you read it!
Aug 23 2013
prev sibling next sibling parent "Meta" <jared771 gmail.com> writes:
On Saturday, 24 August 2013 at 02:12:41 UTC, Jesse Phillips wrote:
 It gets second no matter how you read it!

It was also not very idiomatic. It looks like some performance improvements could be made.
Aug 23 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Aug 23, 2013 at 08:25:20PM -0700, Walter Bright wrote:
 On 8/23/2013 7:10 PM, Jesse Phillips wrote:
If we decided that 2 lines was how we do formatting,

In general, I regard a "line of code" as one statement or one declaration. Comments don't count, nor does cramming 3 statements into one line make it one LOC. Of course, you can still game that, but if you're reasonable then it is a reasonable measure.

I am still skeptical of LOC as a reliable measure of language complexity. How many LOC does the following code have?

	auto formatYear(int year, int monthsPerRow)
	{
	    enum colSpacing = 1;
	    return
	        datesInYear(year)
	        .byMonth()
	        .chunks(monthsPerRow)
	        .map!(r =>
	              r.formatMonths()
	               .array()
	               .pasteBlocks(colSpacing)
	               .join("\n"))
	        .join("\n\n");
	}

It's 14 lines using the above formatting, but the { and } are arguably discountable, which makes it 12 lines of code. But by your metric of one statement / one declaration, this is only 3 lines of code. Or, if you count the lambda separately, 4 lines of code. That's a variation by a factor of at least 3, which makes it an unreliable metric.

So what would you do? Declare that each function call should count as 1 LOC? How then would you count the LOC in the following code?

	void main(string[] args)
	{
	    int x = max(2, min(5, args[1].to!int));
	}

Is it 2 lines of code or 4 (since it's 3 function calls + 1 declaration of main)? Again, you have a variation by a factor of 2, which makes any results derived from LOC unreliable at best.

In none of the above examples did I try to deliberately game with the metric. But the metric is still pretty inaccurate, and requires subjective judgment calls.

A far more reliable measure of code complexity is to look at the compressed size of the source code (e.g., with zip), which is an approximation of the Kolgomorov complexity of the text, roughly equivalent to the amount of information content it has. Say you're comparing two functionally equivalent programs P1 and P2 in two languages L1 and L2, respectively. Then the compressed sizes of P1 and P2 measure roughly how complex the source code must be in order to express the program. The smaller the compressed size, the more expressive the language, because it means that you only need a small amount of complexity in order to express the program.
Also interesting is the ratio of the uncompressed source code size to its compressed size: this ratio tells you how verbose the language is. The higher the ratio, the more verbose the language: roughly speaking, it requires more characters to represent a given amount of complexity. If this ratio is low, it means that the source code is pretty terse; it only requires a small number of characters to express a given amount of complexity.

While the above is obviously still an approximate metric, it at least has some scientific basis in information theory, and I would put more trust in this than in LOC. Already, it tells you that there are two related but different factors at play: the inherent expressiveness of the language (the absolute size of the compressed source code of a program of fixed functionality, given a fixed compression algorithm), and the verbosity of its syntax (the ratio of the uncompressed source to the compressed source).

It's the inherent expressiveness that's a better measure of a language's expressiveness, because it doesn't unfairly penalize a language for having, say, longer, more descriptive identifiers, but it does penalize a language for requiring more convoluted structures to express a given task. Verbosity may matter more to us human coders, especially those of us who like text editors, but it doesn't fairly measure the inherent expressiveness of the language.

LOC not only doesn't even begin to approximate Kolgomorov complexity, it conflates these two related but ultimately different characteristics of the language. Thus, any conclusions drawn from LOC are bound to be highly unreliable.

T

-- 
Designer clothes: how to cover less by paying more.
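
Teoh's metric can be approximated directly in D with std.zlib, which uses the same deflate algorithm as zip (a minimal sketch; the sample strings are made up for illustration):

```d
import std.zlib : compress;

// Deflate size approximates Kolmogorov complexity; the raw/compressed
// ratio approximates the "verbosity" described above.
double verbosity(string source)
{
    auto packed = compress(source, 9);  // level 9 = best compression
    return cast(double) source.length / packed.length;
}

void main()
{
    import std.array : replicate;

    // Highly repetitive source compresses well, so its verbosity ratio
    // comes out much higher than that of dense, nonredundant text.
    auto repetitive = "int X = void; ".replicate(50);
    auto dense = "q7#xZ!pL2@mK9$vB4&nC8*wD5^jF6%tG3(hJ0)";
    assert(verbosity(repetitive) > verbosity(dense));
}
```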
Aug 23 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Aug 23, 2013 at 09:58:11PM -0700, H. S. Teoh wrote:
[...]
 A far more reliable measure of code complexity is to look at the
 compressed size of the source code (e.g., with zip), which is an
 approximation of the Kolgomorov complexity of the text, roughly
 equivalent to the amount of information content it has.

Excuse me, I meant *Kolmogorov* complexity. :) T -- WINDOWS = Will Install Needless Data On Whole System -- CompuMan
Aug 23 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Aug 23, 2013 at 10:13:56PM -0700, Walter Bright wrote:
 On 8/23/2013 9:58 PM, H. S. Teoh wrote:
On Fri, Aug 23, 2013 at 08:25:20PM -0700, Walter Bright wrote:
On 8/23/2013 7:10 PM, Jesse Phillips wrote:
If we decided that 2 lines was how we do formatting,

In general, I regard a "line of code" as one statement or one declaration. Comments don't count, nor does cramming 3 statements into one line make it one LOC. Of course, you can still game that, but if you're reasonable then it is a reasonable measure.

I am still skeptical of LOC as a reliable measure of language complexity. How many LOC does the following code have?

Like I said, you can still game it. I think some common sense applies, not a literal interpretation.

You conveniently snipped the rest of my post, which postulates a far better metric that's no harder to apply in practice. :)

Comparing the compressed size of the source code is far more reliable than LOC. The absolute compressed size gives you an approximation of the Kolmogorov complexity of the source, which approximates the expressiveness of the language if you fix the functionality of the program you're comparing. The ratio of the uncompressed size to the compressed size gives you an approximation of the verbosity of the language's syntax.

All it takes is for you to run zip instead of wc -l, and you have a far better metric for measuring language expressiveness. Insisting on using LOC despite this just makes one lose one's credibility.

T

-- 
They say that "guns don't kill people, people kill people." Well I think the gun helps. If you just stood there and yelled BANG, I don't think you'd kill too many people. -- Eddie Izzard, Dressed to Kill
Aug 23 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 24/08/13 06:58, H. S. Teoh wrote:
 In none of the above examples did I try to deliberately game with the
 metric. But the metric is still pretty inaccurate, and requires
 subjective judgment calls.

It's a heuristic, rather than a metric, I'd say. But as a heuristic it may be useful to compare things like:

(i) what's the minimal number of lines needed (not artificially compressing them!) to write the algorithm idiomatically?

(ii) how many more (or less) lines do you need to optimize the code for performance?

(iii) how many more lines do you need to parallelize the code effectively and safely?

By the way, for your example:
 	auto formatYear(int year, int monthsPerRow)
 	{
 	    enum colSpacing = 1;
 	    return
 		datesInYear(year)
 		.byMonth()
 		.chunks(monthsPerRow)
 		.map!(r =>
 			r.formatMonths()
 			 .array()
 			 .pasteBlocks(colSpacing)
 			 .join("\n"))
 		.join("\n\n");
 	}

Although obviously there are only 3 _logical_ lines there (the function declaration, the enum, and the return statement), here I'd be inclined to say the number of _significant_ lines is 12 (I'm ignoring the braces). The fact that you needed to spread the expression out across multiple lines in order to be readable does carry useful information about the language, IMO.

That doesn't necessarily say the language is _bad_. It's difficult to see how a language could reduce the space needed for that expression without also reducing comprehensibility.
Aug 24 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Aug 24, 2013 at 12:13:06AM -0700, Walter Bright wrote:
 On 8/23/2013 10:23 PM, H. S. Teoh wrote:
Like I said, you can still game it. I think some common sense
applies, not a literal interpretation.

 You conveniently snipped the rest of my post, which postulates a far
 better metric that's no harder to apply in practice. :)

You can't compress by visually looking at the code, and LOC is a unit that people fairly intuitively understand.

What's so difficult about running zip on the code?

The fault of LOC is precisely that people "fairly intuitively" understand it. The problem is that no two people's intuitions ever match. So any conclusions drawn from LOC must necessarily be subjective, and really not that much better than saying "I feel like language A is better than language B, because I just like it better, and besides, it has a nicer color (aka better LOC or whatever other subjective metric)."

Which is fine, if that's what you're looking for. But let's not pretend it has anything to do with the objective quality of a language.

T

-- 
Let's not fight disease by killing the patient. -- Sean 'Shaleh' Perry
Aug 24 2013
prev sibling next sibling parent "Ramon" <spam thanks.no> writes:
I think that there is a lot speaking against sloc.

First it's often (ab?)used for "Ha! My language x is better than 
yours. I can write a web server in 3 lines, you need 30".
And then slocs depend on a lot of things. Like: experience (being 
new or not used to X, I'll need more lines in X than in "my" lang, 
say, D), built-in vs. library, coding style, and others.

In the end "100 sloc in X vs. 180 sloc in Y" quite often means 
close to nothing. Putting it bluntly: If I know enough about all 
the relevant details of the languages+environment to arrive at 
having sloc mean anything useful, I won't need that comparison in 
the first place.
Aug 24 2013
prev sibling next sibling parent "Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:
On Saturday, 24 August 2013 at 16:19:07 UTC, H. S. Teoh wrote:
 The fault of LOC is precisely that people "fairly intuitively"
 understand it. The problem is that no two people's intuitions 
 ever
 match.  So any conclusions drawn from LOC must necessarily be
 subjective, and really not that much better than saying "I feel 
 like
 language A is better than language B, because I just like it 
 better, and
 besides, it has a nicer color (aka better LOC or whatever other
 subjective metric)."

That sounds to be exactly what we wish to capture.
 Which is fine, if that's what you're looking for. But let's not 
 pretend
 it has anything to do with the objective quality of a language.

See above. What use is an objective measure when people's taste for a language is subjective?
Aug 24 2013
prev sibling next sibling parent "Paul Jurczak" <pauljurczak yahoo.com> writes:
On Saturday, 24 August 2013 at 04:59:41 UTC, H. S. Teoh wrote:
[..]
 A far more reliable measure of code complexity is to look at the
 compressed size of the source code (e.g., with zip), which is an
 approximation of the Kolgomorov complexity of the text, roughly
 equivalent to the amount of information content it has.

Not always. Using long names vs. short ones will substantially inflate zip file size, but will not affect LOC count.
Aug 25 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Paul Jurczak:

 Using long names vs. short ones will substantially inflate zip 
 file size, but will not affect LOC count.

On the other hand, if you use a very good compressor (like the PPMd of 7Zip, or even better a PAQ by the good Matt) the identifier names, which are mostly a concatenation of English words, will be compressed to around 1-2 bits/char.

Bye,
bearophile
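bearophile's point can be illustrated with a sketch (mine, not from the thread; using Python's lzma in place of PPMd/PAQ, and made-up snippets): a long repeated identifier inflates the raw size far more than the compressed size.

```python
import lzma

# Same statement, repeated 30 times, with a short vs a long identifier.
short = "x = x + 1\n" * 30
long_ = "accumulatedTotalValue = accumulatedTotalValue + 1\n" * 30

raw_gap = len(long_) - len(short)   # 30 lines * 40 extra bytes = 1200 bytes
zip_gap = (len(lzma.compress(long_.encode()))
           - len(lzma.compress(short.encode())))

# After compression, the repeated long identifier is nearly free.
print(raw_gap, zip_gap)
```

A dictionary-based compressor encodes the repeated long name as back-references, so the gap shrinks dramatically; this is why compressed size penalizes descriptive naming much less than raw byte count does.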
Aug 25 2013
prev sibling next sibling parent "Paulo Pinto" <pjmp progtools.org> writes:
On Saturday, 24 August 2013 at 17:01:56 UTC, Ramon wrote:
 I think that there is a lot speaking against sloc.

 First it's often (ab?)used for "Ha! My language x is better 
 than yours. I can write a web server in 3 lines, you need 30".
 And then slocs say a lot of things about a lot of things. Like: 
 Experience (being new or not used to X I'll need more lines in 
 X than in "my" lang, say, D), built in vs. library, coding 
 style, and others.

 In the end "100 sloc in X vs. 180 sloc in Y" quite often means 
 close to nothing. Putting it bluntly: If I know enough about 
 all the relevant details of the languages+environment to arrive 
 at having sloc mean anything useful, I won't need that 
 comparison in the first place.

I once worked on a project where QA and PM people would look at all the metrics one can generate out of Sonar (http://www.sonarqube.org/) and, as a side-effect, dictate software design decisions based on them. The reason being, we the coders started to write code and unit tests that would drive the metrics towards the numbers given as targets for the project.

If you don't know it, there is an online version to play with here: http://nemo.sonarqube.org/

We joked it was QDD, QA Driven Development.

--
Paulo
Aug 25 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 24/08/13 19:01, Ramon wrote:
 I think that there is a lot speaking against sloc.

 First it's often (ab?)used for "Ha! My language x is better than yours. I can
 write a web server in 3 lines, you need 30".

Don't know about a web server, but I remember somewhere online I found this really cool 3-line quicksort that you can do in Haskell :-)

    qsort []     = []
    qsort (x:xs) = qsort (filter (< x) xs) ++ [x]
                   ++ qsort (filter (>= x) xs)
Aug 25 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 26/08/13 01:06, Andrei Alexandrescu wrote:
 This is one of the worst PR functional programming has ever gotten, and one of
 the worst things FP has done to the larger community. Somebody should do hard
 time for this. And yes, for that matter it's a great example in which SLOCs are
 not a very good measure.

I thought you'd like me bringing this up. ;-)
Aug 25 2013
prev sibling next sibling parent "Paul Jurczak" <pauljurczak yahoo.com> writes:
On Sunday, 25 August 2013 at 23:16:17 UTC, Joseph Rushton 
Wakeling wrote:
 On 26/08/13 01:06, Andrei Alexandrescu wrote:
 This is one of the worst PR functional programming has ever 
 gotten, and one of
 the worst things FP has done to the larger community. Somebody 
 should do hard
 time for this. And yes, for that matter it's a great example 
 in which SLOCs are
 not a very good measure.

I thought you'd like me bringing this up. ;-)

You still have a chance, because I don't quite get it. With the little I know about Haskell, I find this code very elegant. What is wrong with it? Performance?
Aug 25 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Paul Jurczak:

 You still have a chance, because I don't quite get it. With the 
 little I know about Haskell, I find this code very elegant. 
 What is wrong with it? Performance?

A faithful QuickSort should work in-place, unlike that code. This is an implementation of a similar functional algorithm in D, a "fake" QuickSort:

import std.stdio, std.algorithm, std.range, std.array;

auto quickSort(T)(T[] items) /*pure*/ nothrow {
    if (items.length < 2)
        return items;
    auto pivot = items[0];
    return items[1 .. $].filter!(x => x < pivot).array.quickSort ~
           pivot ~
           items[1 .. $].filter!(x => x >= pivot).array.quickSort;
}

void main() {
    [4, 65, 2, -31, 0, 99, 2, 83, 782, 1].quickSort.writeln;
}

It's a similar situation with the Sieve of Eratosthenes. There are ways to write an acceptable (but a little slower) Sieve of E. with immutable lists, but many of the little functional implementations you see around are not a true Sieve of E.:
http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf

Bye,
bearophile
Aug 25 2013
prev sibling next sibling parent "Meta" <jared771 gmail.com> writes:
On Monday, 26 August 2013 at 01:16:21 UTC, Paul Jurczak wrote:
 You still have a chance, because I don't quite get it. With the 
 little I know about Haskell, I find this code very elegant. 
 What is wrong with it? Performance?

It's a huge blowup in time complexity. They say that Lisp programmers know the value of everything and the cost of nothing... That's probably true of Haskell programmers as well.
Aug 25 2013
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Monday, 26 August 2013 at 01:16:21 UTC, Paul Jurczak wrote:
 On Sunday, 25 August 2013 at 23:16:17 UTC, Joseph Rushton 
 Wakeling wrote:
 On 26/08/13 01:06, Andrei Alexandrescu wrote:
 This is one of the worst PR functional programming has ever 
 gotten, and one of
 the worst things FP has done to the larger community. 
 Somebody should do hard
 time for this. And yes, for that matter it's a great example 
 in which SLOCs are
 not a very good measure.

I thought you'd like me bringing this up. ;-)

You still have a chance, because I don't quite get it. With the little I know about Haskell, I find this code very elegant. What is wrong with it? Performance?

It is a quick copy copy copy no so quick sort.
Aug 25 2013
prev sibling next sibling parent "Paul Jurczak" <pauljurczak yahoo.com> writes:
On Monday, 26 August 2013 at 03:44:30 UTC, Andrei Alexandrescu 
wrote:
 On 8/25/13 6:16 PM, Paul Jurczak wrote:
 On Sunday, 25 August 2013 at 23:16:17 UTC, Joseph Rushton 
 Wakeling wrote:
 On 26/08/13 01:06, Andrei Alexandrescu wrote:
 This is one of the worst PR functional programming has ever 
 gotten,
 and one of
 the worst things FP has done to the larger community. 
 Somebody should
 do hard
 time for this. And yes, for that matter it's a great example 
 in which
 SLOCs are
 not a very good measure.

I thought you'd like me bringing this up. ;-)

You still have a chance, because I don't quite get it. With the little I know about Haskell, I find this code very elegant. What is wrong with it? Performance?

I ranted about that a couple of times (including during my DConf talk). In brief:

1. quicksort is based on in-place partition, so at best that example is a different (and less adequate) algorithm inspired by quicksort.

2. choosing the first element as pivot yields quadratic performance on almost-sorted sequences, which is a common practical case.

That example is part of a toxic trifecta (also including the doubly-exponential fibonacci and the O(N) space factorial) that has done a lot of bad teaching.

Andrei

Andrei, deadalnix, Meta, bearophile: I see, performance sucks and this function only kind of looks like a quicksort. It makes sense putting this 'quicksort' and the recursive Fibonacci and factorial in the same bag of ivory tower ideas, otherwise known as Haskell. But it looked so nice :-(
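Andrei's two objections can be demonstrated concretely (a sketch of my own, not from the thread): a genuine in-place quicksort with a Lomuto partition, but with the same bad first-element pivot choice the Haskell one-liner uses. Counting comparisons shows the quadratic blowup on already-sorted input.

```python
import random

def quicksort(a, lo=0, hi=None, stats=None):
    """In-place quicksort, FIRST element as pivot (Lomuto partition).
    Returns the number of element comparisons performed."""
    if hi is None:
        hi = len(a) - 1
    if stats is None:
        stats = {"cmp": 0}
    if lo < hi:
        pivot, i = a[lo], lo
        for j in range(lo + 1, hi + 1):
            stats["cmp"] += 1
            if a[j] < pivot:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[lo], a[i] = a[i], a[lo]          # move pivot into its final position
        quicksort(a, lo, i - 1, stats)
        quicksort(a, i + 1, hi, stats)
    return stats["cmp"]

n = 200
already_sorted = list(range(n))
shuffled = list(range(n))
random.Random(1).shuffle(shuffled)

worst = quicksort(already_sorted)   # sorted input: n*(n-1)/2 comparisons
avg = quicksort(shuffled)           # random input: roughly n log n comparisons
print(worst, avg)
```

With the first element as pivot, an already-sorted array yields empty left partitions all the way down, so the comparison count hits the full n(n-1)/2 quadratic worst case, exactly the "common practical case" Andrei warns about.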
Aug 25 2013
prev sibling next sibling parent Russel Winder <russel winder.org.uk> writes:
On Mon, 2013-08-26 at 00:57 +0200, Joseph Rushton Wakeling wrote:
 On 24/08/13 19:01, Ramon wrote:
 I think that there is a lot speaking against sloc.

 First it's often (ab?)used for "Ha! My language x is better than yours. I can
 write a web server in 3 lines, you need 30".

 Don't know about a web server, but I remember somewhere online I found this
 really cool 3-line quicksort that you can do in Haskell :-)

      qsort []     = []
      qsort (x:xs) = qsort (filter (< x) xs) ++ [x]
                     ++ qsort (filter (>= x) xs)

OK so good for the first 20s of a lecture on Quicksort and totally useless for doing anything properly. Two main reasons:

1. It copies data rather than doing it in situ, should use Mergesort.
2. passes over the data twice instead of once.

This is a perfect example of where minimizing LOC and doing the right thing are opposed.

--
Russel.
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder
Aug 26 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 26/08/13 14:04, Russel Winder wrote:
 OK so good for the first 20s of a lecture on Quicksort and totally
 useless for doing anything properly. Two main reasons:

 1. It copies data rather than doing it in situ, should use Mergesort.
 2. passes over the data twice instead of once.

 This is a perfect example of where minimizing LOC and doing the right
 thing are opposed.

If anyone hadn't already realized, my citation of this example was a joke based on the fact that it's explicitly cited in Andrei's article "On Iteration" as an example of beautiful-looking code that is terrible in practice. Or, as he puts it, "Emperors clad in boxers". :-)

Lisp- and Haskell-oriented friends have objected to that example as a strawman, claiming that everyone knows that in practice you implement QuickSort differently, but I think it's an entirely fair critique -- examples like this offer false promises about the practical expressiveness of a language.
Aug 26 2013
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Monday, 26 August 2013 at 12:05:09 UTC, Russel Winder wrote:
 On Mon, 2013-08-26 at 00:57 +0200, Joseph Rushton Wakeling 
 wrote:
 On 24/08/13 19:01, Ramon wrote:
 I think that there is a lot speaking against sloc.

 First it's often (ab?)used for "Ha! My language x is better 
 than yours. I can
 write a web server in 3 lines, you need 30".

Don't know about a web server, but I remember somewhere online I found this really cool 3-line quicksort that you can do in Haskell :-)

    qsort []     = []
    qsort (x:xs) = qsort (filter (< x) xs) ++ [x]
                   ++ qsort (filter (>= x) xs)

OK so good for the first 20s of a lecture on Quicksort and totally useless for doing anything properly. Two main reasons:

1. It copies data rather than doing it in situ, should use Mergesort.
2. passes over the data twice instead of once.

This is a perfect example of where minimizing LOC and doing the right thing are opposed.

It goes through the data 3 times: to filter (twice) and to concat at the end.
Aug 26 2013
prev sibling next sibling parent "Marco Leise" <Marco.Leise gmx.de> writes:
On Saturday, 24 August 2013 at 04:22:09 UTC, Meta wrote:
 On Saturday, 24 August 2013 at 02:12:41 UTC, Jesse Phillips 
 wrote:
 It gets second no matter how you read it!

It was also not very idiomatic. It looks like some performance improvements could be made.

I made it idiomatic, D is on place 1 now by a big margin. See the 'ldc2' entry: http://togototo.wordpress.com/2013/08/23/benchmarks-round-two-parallel-go-rust-d-scala-and-nimrod/
Nov 07 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Marco Leise:

 I made it idiomatic, D is on place 1 now by a big margin. See 
 the
 'ldc2' entry:
 http://togototo.wordpress.com/2013/08/23/benchmarks-round-two-parallel-go-rust-d-scala-and-nimrod/

Very nice. I have made a more idiomatic version (in D global constants don't need to be IN_UPPERCASE), I have added a few missing immutable annotations, and, given that the benchmark also counts line numbers, I have made the code a little more compact (particularly the struct definitions). But I have moved the then branch of some ifs onto a new line, increasing the line count to make the code a little more readable, so it's not unnaturally compact D code:

http://dpaste.dzfl.pl/d37ba995

Bye,
bearophile
Nov 07 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
 http://dpaste.dzfl.pl/d37ba995

That gives me 83 cloc (http://cloc.sourceforge.net) lines of code, so if you submit that code to the benchmark site, make sure the line count (currently 108, even though cloc gives me 101 on it) gets updated too.

Bye,
bearophile
Nov 07 2013
prev sibling next sibling parent Marco Leise <Marco.Leise gmx.de> writes:
Am Thu, 07 Nov 2013 14:19:00 +0100
schrieb "bearophile" <bearophileHUGS lycos.com>:

 http://dpaste.dzfl.pl/d37ba995

That gives me 83 cloc (http://cloc.sourceforge.net ) lines of code, so if you submit that code to the benchmark site, make sure the line count (currently 108, despite cloc gives me 101 on it) too gets updated. Bye, bearophile

foreach (immutable xi; r.x .. r.x + r.w + 1)

What the heck?! I didn't know that even compiles. :)

About the UPPERCASE_CONSTANTS: I know we tend to use camelCase for them, too. It's just a personal preference to have global constants in uppercase. If you want it submitted please go ahead. My objection is that you condensed the code too much to create a small SLOC in comparison to e.g. C++ and moved away from the original coding style of the benchmark that made SLOC somewhat comparable.

Btw. I also wrote a new algorithm for the given problem, that gives deterministic "full" levels without resorting to trial and error and runs a lot faster (when compiled for 64-bit at least):
http://dpaste.dzfl.pl/d37ba995

--
Marco
Nov 07 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Marco Leise:

 foreach (immutable xi; r.x .. r.x + r.w + 1)

 What the heck?! I didn't know that even compiles. :)

It's an enhancement that I requested, and Kenji implemented some time ago.
 About the UPPERCASE_CONSTANTS: I know we tend to use camelCase
 for them, too. It's just a personal preference to have global
 constants in uppercase.

In C (and in Python, which doesn't enforce their immutability) I too write global UPPERCASE_CONSTANTS, but in D module-level constants behave better, so I prefer to use camelCase.
 If you want it submitted please go ahead.

OK. (I benchmarked it to make sure it's not slower).
 My objection is that
 you condensed the code too much to create a small SLOC in
 comparison to e.g. C++ and moved away from the original coding
 style of the benchmark that made SLOC somewhat comparable.

The style I have used is very similar to my normal style (I add a few more braces, a few more comments, and little more), so I think this D code is not unnatural. Most D entries on Rosetta Code are written in that style.
 Btw. I also wrote a new algorithm for the given problem, that
 gives deterministic "full" levels without resorting to trial
 and error and runs a lot faster (when compiled for 64-bit at
 least):
 http://dpaste.dzfl.pl/d37ba995

I don't know if the author will use that. Nice code and nice popcounts. Bye, bearophile
Nov 07 2013
prev sibling next sibling parent "Daniel Davidson" <nospam spam.com> writes:
On Thursday, 7 November 2013 at 13:12:56 UTC, bearophile wrote:
 Marco Leise:

 I made it idiomatic, D is on place 1 now by a big margin. See 
 the
 'ldc2' entry:
 http://togototo.wordpress.com/2013/08/23/benchmarks-round-two-parallel-go-rust-d-scala-and-nimrod/

Very nice. I have made a more idiomatic version (in D global constants don't need to be IN_UPPERCASE), I have added few missing immutable annotations, and given the benchmark also counts line numbers, I have made the code a little more compact (particularly the struct definitions, but I have moves the then branch of some if on a new line, increasing the line count to make the code a little more readable, so it's not a unnaturally compact D code): http://dpaste.dzfl.pl/d37ba995 Bye, bearophile

Regarding what is idiomatic D, isn't `immutable x = rnd.next % levelSize;` pedantic? Why not just go with `const x = rnd.next % levelSize;`?

Any time the type is a fundamental type with no aliasing there is no sharing, so the differentiation of immutable vs const is unnecessary. Might as well go with const. immutable says neither I nor anyone else will change the value. But since no-one else can change the value anyway, it seems overkill, no?

Thanks
Dan
Nov 07 2013
prev sibling next sibling parent Marco Leise <Marco.Leise gmx.de> writes:
Am Thu, 07 Nov 2013 15:27:57 +0100
schrieb "Daniel Davidson" <nospam spam.com>:

 Regarding what is idiomatic D, isn't `immutable x = rnd.next % 
 levelSize;` pedantic.
 Why not just go with `const x = rnd.next % levelSize;`

Yes it is pedantic and I don't mind if anyone objects. :)
 Any time the type is a fundamental type with no aliasing there is 
 no sharing so the differentiation of immutable vs const is 
 unnecessary. Might as well go with const. immutable says I or 
 no-one else will change the value. But since no-one else can 
 change the value it seems overkill, no?
 
 Thanks
 Dan

Data is either mutable or immutable. The way I see it, const is just a bridge to combine both worlds when the context allows for both mutable and immutable data. Immutable by default would have made the code look less pedantic, but I could imagine there are big downsides to that as well. -- Marco
Nov 07 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Dmitry Olshansky:

 Regarding what is idiomatic D, isn't `immutable x = rnd.next %
 levelSize;` pedantic.
 Why not just go with `const x = rnd.next % levelSize;`

IMHO yes, it's pedantic.

It's a little pedantic, and it's a few characters longer than "const", but I think it's the good standard to use unless something (D implementation bugs, bugs or missing parts in Phobos, code logic, etc.) prevents you from using it :-)

Bye,
bearophile
Nov 07 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Joseph Rushton Wakeling:

 Slightly baffled to see "ldc2" and "ldmd2" listed as two 
 separate entries.  Your code will surely achieve similar speed 
 when compiled with ldmd2 and appropriate optimization choices 
 .... ?

Yes, I think the ldmd2 "entry" should be removed... Bye, bearophile
Nov 07 2013
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Thursday, 7 November 2013 at 14:27:59 UTC, Daniel Davidson 
wrote:
 Regarding what is idiomatic D, isn't `immutable x = rnd.next % 
 levelSize;` pedantic.
 Why not just go with `const x = rnd.next % levelSize;`

I actually prefer usage of `immutable` by default for value types, because it is likely to cause a compilation error if later code changes stop conforming to value semantics. The `const` version is likely to still compile under a wrong assumption.
Nov 07 2013
prev sibling next sibling parent "logicchains" <jonathan.t.barnard gmail.com> writes:
Benchmark author here. I left the ldmd2 entry there to represent 
the performance of the D implementation from the time of the 
benchmark, to highlight that the current D implementation is much 
newer than the others, and that there have been no attempts to 
optimise the C and C++ versions similarly to how the latest D 
version was optimised. If you feel it creates needless confusion 
I can remove it, however, or put a note next to it stating the 
above.

I'm currently working on an OpenGL benchmark, and if anybody 
wants to improve the D implementation they're most welcome: 
https://github.com/logicchains/ParticleBench/blob/master/D.d. 
Currently it's just a straight port of the C version. Conciseness 
will be measured by the compressed source file size this time, 
rather than by the Github SLOC number. Note however that there 
seems to be little scope for improving the speed, as waiting for 
the GPU to render the scene is a significant bottleneck (for 
reference, the unoptimised racket implementation currently runs 
at around 50% of the speed of the -O3 C implementation, which 
obviously wouldn't be possible for something more cpu-bound).

On Thursday, 7 November 2013 at 15:28:13 UTC, bearophile wrote:
 Joseph Rushton Wakeling:

 Slightly baffled to see "ldc2" and "ldmd2" listed as two 
 separate entries.  Your code will surely achieve similar speed 
 when compiled with ldmd2 and appropriate optimization choices 
 .... ?

Yes, I think the ldmd2 "entry" should be removed... Bye, bearophile

Nov 07 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
logicchains:

 Benchmark author here. I left the ldmd2 entry there to 
 represent the performance of the D implementation from the time 
 of the benchmark, to highlight that the current D 
 implementation is much newer than the others, and that there 
 have been no attempts to optimise the C and C++ versions 
 similarly to how the latest D version was optimised. If you 
 feel it creates needless confusion I can remove it, however, or 
 put a note next to it stating the above.

ldmd2 is not a compiler, it's just a thin wrapper that helps use the ldc2 compiler with the same command line options as dmd. So I think putting two entries for ldc2 and ldmd2 is a little confusing for D newbies; it looks like they are two different compilers (also because their performance appears different in the table).

I have just written an answer on your good blog to solve the small problem you have found in my code. Here is a better explanation. If you have code like:

uint[10] data;
foreach (i, ref x; data)
    x = i;

This code works on 32 bit systems, because the index i of an array is deduced as a size_t, so it fits inside the array of uints. On a 64 bit system i is still size_t, but it's now 64 bits long, and it can't fit. Most integral values (like int and uint) in D have a fixed size, while size_t is not fixed in size. This causes some problems when you want to move 32 bit code to a 64 bit system.

Some time ago I opened an enhancement request on this topic, but perhaps better solutions are needed:
https://d.puremagic.com/issues/show_bug.cgi?id=5063

And the solution suggested by Walter here is not enough:
 auto i = array.length;

It's not a big problem, such mistakes usually don't cause significant problems, and sometimes the attempt to avoid the problem is worse than the problem itself.

Bye,
bearophile
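The 32-vs-64-bit mismatch bearophile describes can be sketched outside D too (my illustration, not from the thread; Python's struct module range-checks a 4-byte slot much like the D compiler rejects the implicit size_t-to-uint conversion):

```python
import struct

big_index = 2**32 + 5   # fits a 64-bit size_t, but not a 32-bit uint

# An 8-byte slot, like size_t on a 64-bit system: no problem.
eight_bytes = struct.pack("<Q", big_index)

# A 4-byte slot, like D's uint: the value simply doesn't fit.
rejected = False
try:
    struct.pack("<I", big_index)
except struct.error:
    rejected = True   # analogous to D refusing size_t -> uint implicitly

print(len(eight_bytes), rejected)
```

This is the whole story behind the `foreach (i, ref x; data)` breakage: the code compiled on 32-bit where size_t and uint happen to be the same width, and stopped compiling on 64-bit where they are not.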
Nov 08 2013
prev sibling next sibling parent "logicchains" <jonathan.t.barnard gmail.com> writes:
That's interesting. Is there a particular reason for using size_t 
for array indexing rather than int?

 uint[10] data;
 foreach (i, ref x; data)
     x = i;

 This code works on 32 bit systems, because the index i of an 
 array is deduced as a size_t. So it fits inside the array of 
 uints. On a 64 system i is still size_t, but it's now 64 bit 
 long, and it can't fit. Most integral values (like int and 
 uint) in D have a fixed size. While size_t is not fixed in size.

 This causes some problems when you want to move 32 bit code to 
 a 64 bit system.

 Some times ago I opened an enhancement request on this topic, 
 but perhaps better solutions are needed:
 https://d.puremagic.com/issues/show_bug.cgi?id=5063

Nov 08 2013
prev sibling next sibling parent Marco Leise <Marco.Leise gmx.de> writes:
Am Fri, 08 Nov 2013 09:58:38 +0100
schrieb "logicchains" <jonathan.t.barnard gmail.com>:

 That's interesting. Is there a particular reason for using size_t 
 for array indexing rather than int?

It is the natural representation of an array index. It is unsigned and spans the whole addressable memory area. The _t indicates that its size depends on the target architecture. -- Marco
Nov 08 2013
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Friday, 8 November 2013 at 09:37:56 UTC, Marco Leise wrote:
 The _t indicates that its size depends on the target
 architecture.

Erm? I am pretty sure "_t" is just a short form for "type", common naming notation from C.
Nov 08 2013
prev sibling next sibling parent "logicchains" <jonathan.t.barnard gmail.com> writes:
Ah, right. I'll bear it in mind if I'm ever writing 
cross-architectural code in D.

On Friday, 8 November 2013 at 09:37:56 UTC, Marco Leise wrote:
 Am Fri, 08 Nov 2013 09:58:38 +0100
 schrieb "logicchains" <jonathan.t.barnard gmail.com>:

 That's interesting. Is there a particular reason for using 
 size_t for array indexing rather than int?

It is the natural representation of an array index. It is unsigned and spans the whole addressable memory area. The _t indicates that its size depends on the target architecture.

Nov 08 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Your site counts 90 SLOC for the D entry, that comes from 83 
lines of code plus 7 comment lines. I think you shouldn't count 
the lines of comments, from all the entries.

If you want to count the comments too, then if you want I'll 
submit a 83 lines long D version without comments for your site, 
as in the Scala entry, for a little more fair comparison.

The Scala entry has lines of code like:

case (numLvls, threadNum) => {val rnd = new 
Xorshift32(rand.nextInt); if(!silent) println(s"Thread number 
$threadNum has seed " +  rnd.seed); numLvls -> rnd}

Bye,
bearophile
Nov 08 2013
prev sibling next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Friday, 8 November 2013 at 11:47:02 UTC, logicchains wrote:
 Ah, right. I'll bear it in mind if I'm ever writing 
 cross-architectural code in D.

Using size_t for array indices is a C/C++ convention that is also relevant to D; it's definitely not a D-specific thing. Perhaps it is more common here due to its use in .length for arrays and the index in foreach, amongst other places in the language.
Nov 08 2013
prev sibling next sibling parent "logicchains" <jonathan.t.barnard gmail.com> writes:
Okay, I've updated it to 83. The other entries didn't include 
comments, so I didn't bother checking to remove comments from the 
linecount.

On Friday, 8 November 2013 at 13:57:31 UTC, bearophile wrote:
 Your site counts 90 SLOC for the D entry, that comes from 83 
 lines of code plus 7 comment lines. I think you shouldn't count 
 the lines of comments, from all the entries.

 If you want to count the comments too, then if you want I'll 
 submit a 83 lines long D version without comments for your 
 site, as in the Scala entry, for a little more fair comparison.

 The Scala entry has lines of code like:

 case (numLvls, threadNum) => {val rnd = new 
 Xorshift32(rand.nextInt); if(!silent) println(s"Thread number 
 $threadNum has seed " +  rnd.seed); numLvls -> rnd}

 Bye,
 bearophile

Nov 08 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
logicchains:

 Okay, I've updated it to 83. The other entries didn't include 
 comments, so I didn't bother checking to remove comments from 
 the linecount.

Thank you :-) I think a few comments help the code look more natural :-)

Bye,
bearophile
Nov 09 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Joseph Rushton Wakeling:

 How does the speed of that code change if instead of the Random 
 struct, you use std.random.Xorshift32 ... ?

That change of yours was well studied in the first blog post (the serial one), and the performance loss from using Xorshift32 was significant, even with LDC2. I don't know why. Sometimes even moving things (like the Xorshift struct) into another module changes the code's performance. Performance optimization is still a bit of an art. In theory such cases should be studied, and the performance loss should be understood and fixed :-)

Bye,
bearophile
Nov 09 2013
prev sibling next sibling parent "logicchains" <jonathan.t.barnard gmail.com> writes:
I imagine (although I haven't checked) that std.random.Xorshift32 
uses the algorithm:

         seed ^= seed << 13;
         seed ^= seed >> 17;
         seed ^= seed << 5;
         return seed;

while the levgen benchmarks use the algorithm:

         seed += seed;
         seed ^= (seed > int.max) ? 0x88888eee : 1;
         return seed;

The former produces better random numbers, but it's possible that 
it may be slower.
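Both generators above can be sketched in Python (my transcription of the two snippets, not from the thread; the `& MASK` emulates the wrap-around of D's 32-bit uint, and the xorshift triple 13/17/5 is what logicchains assumes std.random.Xorshift32 uses):

```python
MASK = 0xFFFFFFFF       # emulate D's 32-bit unsigned wrap-around
INT_MAX = 0x7FFFFFFF    # D's int.max

def xorshift32(seed):
    """Marsaglia-style xorshift with the 13/17/5 shift triple."""
    seed ^= (seed << 13) & MASK
    seed ^= seed >> 17
    seed ^= (seed << 5) & MASK
    return seed & MASK

def lfsr(seed):
    """The cheaper generator used in the levgen benchmarks."""
    seed = (seed + seed) & MASK            # seed += seed, wrapping
    seed ^= 0x88888EEE if seed > INT_MAX else 1
    return seed

print(xorshift32(1))  # deterministic: 270369
print(lfsr(1))        # deterministic: 3
```

The lfsr variant does one add, one compare, and one xor per number versus three shift-xor pairs for xorshift32, which is why it can come out faster despite its weaker statistical quality.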

Lack of inlining would definitely make a huge difference. I wrote 
an assembly function for Go that was an exact copy of the 
assembly generated by the LLVM at O3, and it was no faster than 
the native Go function, even though the assembly was much better 
(assembly functions aren't inlined in Go). Changing the assembly 
function to generate and return two random numbers, however, 
increased the overall program speed by around 10%, highlighting 
the overhead of function calls and lack of inlining.

On Saturday, 9 November 2013 at 12:23:25 UTC, bearophile wrote:
 Joseph Rushton Wakeling:

 How does the speed of that code change if instead of the 
 Random struct, you use std.random.Xorshift32 ... ?

That change of yours was well studied in the first blog post (the serial one) and the performance loss of using Xorshift32 was significant, even with LDC2. I don't know why.

Nov 09 2013
prev sibling parent "logicchains" <jonathan.t.barnard gmail.com> writes:
I got it from here:
https://code.google.com/p/go/source/detail?r=3bf9ffdcca1f9585f28dc0e4ca1c75ea29e18be
Apparently it's a linear feedback shift register, and was used in Newsqueak.

On Sunday, 10 November 2013 at 09:42:30 UTC, Joseph Rushton 
Wakeling wrote:
 On 10/11/13 05:31, logicchains wrote:
 The former produces better random numbers, but it's possible 
 that it may be slower.

Ahh, makes sense. Where did you get the particular RNG you used? I don't recognize it.

Nov 10 2013