digitalmars.D.learn - Passing array as const slows down code?

Joseph Rushton Wakeling (28/28) Apr 27 2012 Hello all,

Steven Schveighoffer (10/33) Apr 27 2012 const should not affect code generation *at all*, except for name mangli...

Joseph Rushton Wakeling (7/14) Apr 27 2012 The code is here: https://github.com/WebDrake/Dregs/

Steven Schveighoffer (6/26) Apr 27 2012 Hm.. you have marked all your functions pure as well. I don't think thi...

Joseph Rushton Wakeling (24/27) Apr 27 2012 Yes, me too, hence confusion.

Steven Schveighoffer (8/16) Apr 27 2012 weak purity, I think, is one of the most revolutionary ideas to come fro...

Joseph Rushton Wakeling (12/17) Apr 27 2012 I do agree with that; it's a very attractive aspect of D that a function...

Steven Schveighoffer (25/44) Apr 27 2012 Weakly pure simply means it can be called by a strong-pure function. It...

H. S. Teoh (50/70) Apr 27 2012 As mentioned by others, D internally recognizes two kinds of purity,
Joseph Rushton Wakeling (10/18) Apr 27 2012 Thanks for the extended description of weak purity -- it's been very hel...

Steven Schveighoffer (12/31) Apr 27 2012 No, just make sure all the parameters and the result are either immutabl...

Joseph Rushton Wakeling (5/12) Apr 27 2012 Not worth adding a strongpure (purest?) attribute to enable coders to ex...

Steven Schveighoffer (18/38) Apr 27 2012 In most cases it's not that important. strong Pure is simply for

H. S. Teoh (9/34) Apr 27 2012 In -release mode, array bounds checking is turned off for speed reasons.

Joseph Rushton Wakeling (4/7) Apr 27 2012 ... well what do I know. I tweaked it to do just that, and it runs at e...
Era Scarecrow (24/46) Apr 27 2012 Perhaps I'm wrong, but seeing as your working with a struct I

H. S. Teoh (15/43) Apr 27 2012 I recommend the opposite, actually. Most D code by default should be

Era Scarecrow (12/25) Apr 27 2012 That's why i wasn't sure... I was pretty sure it was passed via

Jonathan M Davis (22/39) Apr 27 2012 pure and @safe have _zero_ affect on the types of function parameters or...

Joseph Rushton Wakeling (9/11) Apr 27 2012 Ahhh, that works. Thank you!

Era Scarecrow (5/13) Apr 27 2012 Well glad you found it :) Probably the different 'type' made it
bearophile (5/9) Apr 28 2012 I suggest to take a look at the asm in both cases, and compare.

Joseph Rushton Wakeling (2/4) Apr 28 2012 I'm afraid I have no idea how to do that, or what to look for.

Steven Schveighoffer (8/20) Apr 30 2012 Try removing the ref and see if it goes back. That usage of ref should ...

Joseph Rushton Wakeling (5/10) May 02 2012 Removing the ref ups the time as described before, but only with GDC (DM...

bearophile (5/13) May 02 2012 Then it's seems a compiler bug. Please, minimize it as much as
Steven Schveighoffer (9/22) May 03 2012 I'm not familiar with GDC, but as bearophile says, it sounds like a

Artur Skawina (6/24) May 03 2012 There *was* a GDC bug some time ago, where a 'const' or 'in' function ar...
Joseph Rushton Wakeling (13/19) May 04 2012 Yes, with the original post I was hoping there might be a quick answer t...
Joseph Rushton Wakeling (20/26) May 04 2012 Let me clarify that last statement. The core function in the struct is ...

Jonathan M Davis (8/12) Apr 27 2012 Thanks to the fact that const is transitive and that it's illegal to cas...

Steven Schveighoffer (8/20) Apr 27 2012 No, it can't. There can easily be another non-const reference to the sa...

Jonathan M Davis (16/39) Apr 27 2012 Except that if the variable isn't shared, then in many cases, it doesn't...
Joseph Rushton Wakeling (6/11) Apr 28 2012 Question on the const/immutable distinction. Given that my function has...
Jonathan M Davis (16/31) Apr 28 2012 size_t is a value type. Arrays are reference types. So, when you pass a

Joseph Rushton Wakeling (5/11) Apr 27 2012 I should add that I'm using GDC 4.6.3 here, not DMD. I just checked wit...

Jonathan M Davis (6/10) Apr 27 2012 The checks are actually left in for @safe code even with -release. You n...

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

Hello all,

A surprise today when tweaking some code.

I have a function which takes as input an array of values (the values are a 
custom struct).  So, in pseudocode, it's something like this:

     struct MyStruct
     {
         size_t x;
         size_t y;
         double z;
     }

     void foo(MyStruct[] input)
     {
         ...
     }

Since the input isn't modified, I thought I'd tag it as const[1]:

     void foo(const MyStruct[] input) { ... }

... but this turned out to noticeably slow down the program.  (Noticeably as
in, 
a 25-second as opposed to 23-second running time, consistently observed on 
repeated trials of different code versions.)

I'm surprised by this as I would have thought if anything tagging a variable as 
const would if anything have increased the speed, or at worst left it the same 
as before.

Can anyone explain why?  It's no big deal, but I'm curious.

Thanks and best wishes,

     -- Joe

-------
[1] Actually I originally tried marking it as immutable, but it turns out
that's 
a stronger condition that doesn't just apply to what happens _inside_ the
function.

Apr 27 2012

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 27 Apr 2012 11:12:07 -0400, Joseph Rushton Wakeling  
<joseph.wakeling webdrake.net> wrote:

 Hello all,

 A surprise today when tweaking some code.

 I have a function which takes as input an array of values (the values  
 are a custom struct).  So, in pseudocode, it's something like this:

      struct MyStruct
      {
          size_t x;
          size_t y;
          double z;
      }

      void foo(MyStruct[] input)
      {
          ...
      }

 Since the input isn't modified, I thought I'd tag it as const[1]:

      void foo(const MyStruct[] input) { ... }

 ... but this turned out to noticeably slow down the program.   
 (Noticeably as in, a 25-second as opposed to 23-second running time,  
 consistently observed on repeated trials of different code versions.)

 I'm surprised by this as I would have thought if anything tagging a  
 variable as const would if anything have increased the speed, or at  
 worst left it the same as before.

 Can anyone explain why?  It's no big deal, but I'm curious.

const should not affect code generation *at all*, except for name mangling  
(const MyStruct is a different type from MyStruct), and generating an  
extra TypeInfo for const MyStruct and const MyStruct[].  Const is purely a  
compile-time concept.

This cannot account for an extra 2 seconds.  Something else is happening.

Without more of your code, nobody can give a definitive answer except it  
is not supposed to affect it.

-Steve

Apr 27 2012

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 27/04/12 17:18, Steven Schveighoffer wrote:
 const should not affect code generation *at all*, except for name mangling
 (const MyStruct is a different type from MyStruct), and generating an extra
 TypeInfo for const MyStruct and const MyStruct[]. Const is purely a
compile-time
 concept.

 This cannot account for an extra 2 seconds. Something else is happening.

 Without more of your code, nobody can give a definitive answer except it is not
 supposed to affect it.

The code is here: https://github.com/WebDrake/Dregs/

... and the particular file concerned is
https://github.com/WebDrake/Dregs/blob/master/dregs/codetermine.d

You'll see that there are about 8 different functions in there which receive as 
input an array of type Rating!(UserID, ObjectID, Reputation)[].  These were the 
inputs I was marking as const.

Apr 27 2012

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 27 Apr 2012 11:33:39 -0400, Joseph Rushton Wakeling  
<joseph.wakeling webdrake.net> wrote:

 On 27/04/12 17:18, Steven Schveighoffer wrote:
 const should not affect code generation *at all*, except for name  
 mangling
 (const MyStruct is a different type from MyStruct), and generating an  
 extra
 TypeInfo for const MyStruct and const MyStruct[]. Const is purely a  
 compile-time
 concept.

 This cannot account for an extra 2 seconds. Something else is happening.

 Without more of your code, nobody can give a definitive answer except  
 it is not
 supposed to affect it.

 The code is here: https://github.com/WebDrake/Dregs/

 ... and the particular file concerned is
 https://github.com/WebDrake/Dregs/blob/master/dregs/codetermine.d

 You'll see that there are about 8 different functions in there which  
 receive as input an array of type Rating!(UserID, ObjectID,  
 Reputation)[].  These were the inputs I was marking as const.

Hm.. you have marked all your functions pure as well.  I don't think this  
will affect anything in the current implementation, but it might.   
However, I'd expect the opposite (const version is faster) if anything.

-Steve

Apr 27 2012

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 27/04/12 18:02, Steven Schveighoffer wrote:
 Hm.. you have marked all your functions pure as well. I don't think this will
 affect anything in the current implementation, but it might. However, I'd
expect
 the opposite (const version is faster) if anything.

Yes, me too, hence confusion.

I have to say I'm somewhat confused about whether the pure markings are really 
accurate.  The reputation() modifies the reputationUser_ and reputationObject_ 
internal arrays and hence the returned values of reputationUser() and 
reputationObject().  It's true that for a given input to reputation(), the 
outcome will always be the same, but there's nevertheless mutability involved: 
if I pass one input to reputation(), and then another, the reputationObject() 
and ...User() outputs will change as a result.

I suppose it's a kind of weak purity, but I'm not too happy with it.  I suppose 
I could get reputation() to _return_ reputationObject and reputationUser
instead 
of having them called via functions; but I'd be worried it would slow
everything 
down substantially.

Background to this piece of code: it's a D rewrite of some C++ I wrote a couple 
of years back as part of a research project.  I took quite a lot of time to
make 
the C++ as efficient as possible, so this is a nice test case to see if I can 
get D to produce similar efficiency while having much more elegantly-written 
code.  It's also just a nice way to learn to be "D-ish", working on a problem 
that I already understand well.

Currently the D code is MUCH more elegant and easy to read than the C++, and 
runs about 2 seconds slower for the test case implemented here (so on my 
machine, C++ does it in about 21s as opposed to D's 23), with that difference 
scaling with problem size.  It's a tolerable difference, though it would be
nice 
to see if I can close the gap.

Apr 27 2012

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 27 Apr 2012 12:37:41 -0400, Joseph Rushton Wakeling  
<joseph.wakeling webdrake.net> wrote:

 I have to say I'm somewhat confused about whether the pure markings are  
 really accurate.  The reputation() modifies the reputationUser_ and  
 reputationObject_ internal arrays and hence the returned values of  
 reputationUser() and reputationObject().  It's true that for a given  
 input to reputation(), the outcome will always be the same, but there's  
 nevertheless mutability involved: if I pass one input to reputation(),  
 and then another, the reputationObject() and ...User() outputs will  
 change as a result.

weak purity, I think, is one of the most revolutionary ideas to come from  
D.  Essentially, we get compiler-checked pure functions that can be  
written imperatively instead of functionally.

If the function is weakly pure, then it cannot be optimized based on  
purity, so I don't think there is any way marking it pure *hurts*.

-Steve

Apr 27 2012

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 27/04/12 19:08, Steven Schveighoffer wrote:
 weak purity, I think, is one of the most revolutionary ideas to come from D.
 Essentially, we get compiler-checked pure functions that can be written
 imperatively instead of functionally.

I do agree with that; it's a very attractive aspect of D that a function or 
object can be internally imperative but pure as far as the outside world is 
concerned.  Seeing that documented in Andrei's writings was another "Wow!" 
moment for me about D. :-)

What concerned me here was that the way my reputation() function was set up 
actually did change the public state of the object, which to my mind isn't even 
weakly pure.  But I just wrote a fix which, contrary to my fears, didn't affect 
the speed of the program.

 If the function is weakly pure, then it cannot be optimized based on purity, so
 I don't think there is any way marking it pure *hurts*.

I was more concerned that the compiler wasn't identifying what to me was a 
violation of purity.  I'm fairly sure I can also find a way to make some of 
those "nothrow" functions throw an error ...

Apr 27 2012

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 27 Apr 2012 13:25:30 -0400, Joseph Rushton Wakeling  
<joseph.wakeling webdrake.net> wrote:

 On 27/04/12 19:08, Steven Schveighoffer wrote:
 weak purity, I think, is one of the most revolutionary ideas to come  
 from D.
 Essentially, we get compiler-checked pure functions that can be written
 imperatively instead of functionally.

 I do agree with that; it's a very attractive aspect of D that a function  
 or object can be internally imperative but pure as far as the outside  
 world is concerned.  Seeing that documented in Andrei's writings was  
 another "Wow!" moment for me about D. :-)

 What concerned me here was that the way my reputation() function was set  
 up actually did change the public state of the object, which to my mind  
 isn't even weakly pure.  But I just wrote a fix which, contrary to my  
 fears, didn't affect the speed of the program.

Weakly pure simply means it can be called by a strong-pure function.  It  
has no other benefits except for that, and is basically a normal non-pure  
function otherwise.  It's a weird thing, and it's hard to wrap your mind  
around.  I don't think anyone has ever done it before, so it's hard to  
explain.

But the benefits are *enormous*.  Now, instead of pure versions of so many  
normal functions, much of the standard library can be marked as pure, and  
callable from either pure or unpure functions.  It opens up the whole  
library to pure functions which would otherwise be a painstaking porting  
effort.

 If the function is weakly pure, then it cannot be optimized based on  
 purity, so
 I don't think there is any way marking it pure *hurts*.

 I was more concerned that the compiler wasn't identifying what to me was  
 a violation of purity.  I'm fairly sure I can also find a way to make  
 some of those "nothrow" functions throw an error ...

It depends on what your definition of "purity" is.  For D's definition,  
it's pure, otherwise the compiler would reject it.

The definition of pure for D is, it cannot accept any shared data as  
parameters, and it cannot access any global non-immutable data.  That's  
it.  There's no requirement for immutability of parameters or results.

In hindsight, D's 'pure' keyword really should be named something else,  
since it varies from the traditional definition of 'pure'.  But it is what  
it is now, and no changing it.

nothrow means it cannot throw an Exception.  It can still throw an error,  
or technically even something else derived from Throwable.  This is  
necessary, because otherwise, nothing could be nothrow (anything might  
throw an out of memory error).

-Steve

Apr 27 2012

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Apr 27, 2012 at 07:25:30PM +0200, Joseph Rushton Wakeling wrote:
 On 27/04/12 19:08, Steven Schveighoffer wrote:
weak purity, I think, is one of the most revolutionary ideas to come
from D.  Essentially, we get compiler-checked pure functions that can
be written imperatively instead of functionally.

 
 I do agree with that; it's a very attractive aspect of D that a
 function or object can be internally imperative but pure as far as the
 outside world is concerned.  Seeing that documented in Andrei's
 writings was another "Wow!" moment for me about D. :-)
 
 What concerned me here was that the way my reputation() function was
 set up actually did change the public state of the object, which to my
 mind isn't even weakly pure.  But I just wrote a fix which, contrary
 to my fears, didn't affect the speed of the program.

As mentioned by others, D internally recognizes two kinds of purity,
which is unofficially called "strong purity" and "weak purity". The idea
is that strong purity corresponds with what functional programming
languages call "pure", whereas weak purity allows mutation of state
outside the function, *but only through function parameters*.

The idea is that strongly pure functions are allowed to call weakly pure
functions, provided any reference parameters passed in never escape the
scope of the strongly pure function. For example, a strongly pure
function is allowed to pass in pointers to local variables which the
weakly pure function can modify through the pointer:

	pure real computationHelper(real arg1, ref real arg2) {
		// This mutates arg2, so this function is only weakly
		// pure
		arg2 = sin(arg1);

		return cos(arg1);
	}

	pure real complexComputation(real[] args) {
		real tempValue;
		...

		// Note: this changes tempValue
		real anotherTempValue = computationHelper(args[0],
					tempValue);

		// But since tempValue is only used inside this
		// function, it does not violate strong purity
		return tempValue + complicatedFunc(args[1]);
	}

As far as the outside world is concerned, complexComputation is a
strongly pure function, because even though computationHelper mutates
stuff through its arguments, those mutations are all local to
complexComputation, and does not actually touch global world state
outside. Weakly pure functions cannot mutate anything that they don't
receive through their arguments (the 'this' pointer is considered part
of their arguments, it's just implicit), so as long as the strongly pure
function never passes in a pointer to the outside world, everything is
OK.

The motivation for distinguishing between these types of purity is to
expand the scope of what the strongly-pure function can call. By
allowing weakly pure functions to be called from strongly-pure
functions, we open up many more implementation possibilities while still
retaining all the benefits of strong purity.


If the function is weakly pure, then it cannot be optimized based on
purity, so I don't think there is any way marking it pure *hurts*.

 
 I was more concerned that the compiler wasn't identifying what to me
 was a violation of purity.  I'm fairly sure I can also find a way to
 make some of those "nothrow" functions throw an error ...

It's not a violation of purity, it's just "weak purity". If you try to
access a global variable, for example, it will trigger an error.

And nothrow functions *are* allowed to throw Error objects. That's also
a deliberate decision. :-)


T

-- 
"640K ought to be enough" -- Bill G., 1984.
"The Internet is not a primary goal for PC usage" -- Bill G., 1995.
"Linux has no impact on Microsoft's strategy" -- Bill G., 1999.

Apr 27 2012

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 27/04/12 20:25, H. S. Teoh wrote:
 On Fri, Apr 27, 2012 at 07:25:30PM +0200, Joseph Rushton Wakeling wrote:
 I was more concerned that the compiler wasn't identifying what to me
 was a violation of purity.  I'm fairly sure I can also find a way to
 make some of those "nothrow" functions throw an error ...

 It's not a violation of purity, it's just "weak purity". If you try to
 access a global variable, for example, it will trigger an error.

Thanks for the extended description of weak purity -- it's been very helpful in 
understanding the concept better.

Is there a particular way in which I can explicitly mark a function as strongly 
pure?

 And nothrow functions *are* allowed to throw Error objects. That's also
 a deliberate decision. :-)

... yes, as I just found out when I decided to test it 2 minutes ago :-)  OTOH
I 
found that with or without the nothrow option, when the -release flag was used 
in compiling the code, the error was not thrown and the program did not exit -- 
it just sat there seemingly running but doing nothing.  This was unexpected ...

The deliberate error was in this case a range exception when accessing an array.

Apr 27 2012

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 27 Apr 2012 14:29:40 -0400, Joseph Rushton Wakeling  
<joseph.wakeling webdrake.net> wrote:

 On 27/04/12 20:25, H. S. Teoh wrote:
 On Fri, Apr 27, 2012 at 07:25:30PM +0200, Joseph Rushton Wakeling wrote:
 I was more concerned that the compiler wasn't identifying what to me
 was a violation of purity.  I'm fairly sure I can also find a way to
 make some of those "nothrow" functions throw an error ...

 It's not a violation of purity, it's just "weak purity". If you try to
 access a global variable, for example, it will trigger an error.

 Thanks for the extended description of weak purity -- it's been very  
 helpful in understanding the concept better.

 Is there a particular way in which I can explicitly mark a function as  
 strongly pure?

No, just make sure all the parameters and the result are either immutable  
or implicitly castable to immutable (hard to explain this better).

Hm... gives me a thought that unit tests should have a helper that allows  
ensuring this:

static assert(isStrongPure!fn);

Or maybe __traits(isStrongPure, fn) if it's too difficult to do in a  
library.

 ... yes, as I just found out when I decided to test it 2 minutes ago  
 :-)  OTOH I found that with or without the nothrow option, when the  
 -release flag was used in compiling the code, the error was not thrown  
 and the program did not exit -- it just sat there seemingly running but  
 doing nothing.  This was unexpected ...

 The deliberate error was in this case a range exception when accessing  
 an array.

array bounds checks and asserts are turned off during -release compilation  
:)

-Steve

Apr 27 2012

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 27/04/12 20:39, Steven Schveighoffer wrote:
 No, just make sure all the parameters and the result are either immutable or
 implicitly castable to immutable (hard to explain this better).

 Hm... gives me a thought that unit tests should have a helper that allows
 ensuring this:

 static assert(isStrongPure!fn);

 Or maybe __traits(isStrongPure, fn) if it's too difficult to do in a library.

Not worth adding a strongpure (purest?) attribute to enable coders to
explicitly 
mark their expectations?

 array bounds checks and asserts are turned off during -release compilation :)

Maybe I've misunderstood what that means, but I would still have expected the 
program to stop running at least.  What was it doing instead ... ? :-)

Apr 27 2012

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 27 Apr 2012 15:28:48 -0400, Joseph Rushton Wakeling  
<joseph.wakeling webdrake.net> wrote:

 On 27/04/12 20:39, Steven Schveighoffer wrote:
 No, just make sure all the parameters and the result are either  
 immutable or
 implicitly castable to immutable (hard to explain this better).

 Hm... gives me a thought that unit tests should have a helper that  
 allows
 ensuring this:

 static assert(isStrongPure!fn);

 Or maybe __traits(isStrongPure, fn) if it's too difficult to do in a  
 library.

 Not worth adding a strongpure (purest?) attribute to enable coders to  
 explicitly mark their expectations?

In most cases it's not that important.  strong Pure is simply for  
optimization.

Plus, what that would do is guarantee that the function only compiles if  
the parameters and return value are implicitly convertible to immutable.   
There is no other special requirements, and the parameters and return type  
are sitting right there, next to the strongpure function, it's likely not  
necessary to re-state the intent.

Note that, even if a strong pure function becomes a weak-pure one, it's  
still valid (and compiles, and runs).

It can be viewed similarly to inline, where the compiler makes the  
decision.  Sometimes it's nice to force the issue, but for the most part,  
the compiler will know better what should be inlined.

 array bounds checks and asserts are turned off during -release  
 compilation :)

 Maybe I've misunderstood what that means, but I would still have  
 expected the program to stop running at least.  What was it doing  
 instead ... ? :-)

It didn't crash because the memory it accessed was still part of the  
program's address space.  out of bounds references don't have to  
necessarily point at invalid memory.

-Steve

Apr 27 2012

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Apr 27, 2012 at 08:29:40PM +0200, Joseph Rushton Wakeling wrote:
 On 27/04/12 20:25, H. S. Teoh wrote:
On Fri, Apr 27, 2012 at 07:25:30PM +0200, Joseph Rushton Wakeling wrote:
I was more concerned that the compiler wasn't identifying what to me
was a violation of purity.  I'm fairly sure I can also find a way to
make some of those "nothrow" functions throw an error ...

It's not a violation of purity, it's just "weak purity". If you try
to access a global variable, for example, it will trigger an error.

 
 Thanks for the extended description of weak purity -- it's been very
 helpful in understanding the concept better.
 
 Is there a particular way in which I can explicitly mark a function as
 strongly pure?

The "strong/weak" distinction is deduced by the compiler internally.


And nothrow functions *are* allowed to throw Error objects. That's
also a deliberate decision. :-)

 
 ... yes, as I just found out when I decided to test it 2 minutes ago
 :-)  OTOH I found that with or without the nothrow option, when the
 -release flag was used in compiling the code, the error was not thrown
 and the program did not exit -- it just sat there seemingly running
 but doing nothing.  This was unexpected ...
 
 The deliberate error was in this case a range exception when accessing
 an array.

In -release mode, array bounds checking is turned off for speed reasons.
The idea being that before you compile with -release you've already
extensively tested your program and are sure that simple bugs like array
overruns have been weeded out.


T

-- 
IBM = I Blame Microsoft

Apr 27 2012

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 27/04/12 18:37, Joseph Rushton Wakeling wrote:
 I suppose I could get reputation() to _return_ reputationObject and
 reputationUser instead of having them called via functions; but I'd be worried
 it would slow everything down substantially.

... well what do I know.  I tweaked it to do just that, and it runs at exactly 
the same speed.
https://github.com/WebDrake/Dregs/commit/9926094645118ecf03c3bedd57f923b02e2bb7fc

Apr 27 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Friday, 27 April 2012 at 16:02:00 UTC, Steven Schveighoffer 
wrote:
 On Fri, 27 Apr 2012 11:33:39 -0400, Joseph Rushton Wakeling
 On 27/04/12 17:18, Steven Schveighoffer wrote:


 const should not affect code generation *at all*, except for 
 name mangling
 (const MyStruct is a different type from MyStruct), and 
 generating an extra
 TypeInfo for const MyStruct and const MyStruct[]. Const is 
 purely a compile-time
 concept.

 This cannot account for an extra 2 seconds. Something else is 
 happening.



 The code is here: https://github.com/WebDrake/Dregs/

 You'll see that there are about 8 different functions in there 
 which receive as input an array of type Rating!(UserID, 
 ObjectID, Reputation)[].  These were the inputs I was marking 
 as const.

 Hm.. you have marked all your functions pure as well.  I don't 
 think this will affect anything in the current implementation, 
 but it might.  However, I'd expect the opposite (const version 
 is faster) if anything.


  Perhaps I'm wrong, but seeing as your working with a struct I 
would think the following should be noted.

  First, pure and nothrow. Nothing wrong with labeling functions 
as such, but I would recommend removing them and once the whole 
thing is done and you get it in a finished state that you start 
adding them back in if you think they qualify. Recently I know I 
added a  safe token to a simple function which of course threw a 
range exception later. Oddly enough it was because I didn't check 
if the array was null before accessing it.

  Second, you are passing an array of structs. Either a fat 
pointer is being passed, or it's shallow copying; depending on 
the scenario either could be true. As an array it should be 
passing a fat pointer but if the compiler has to make certain 
guarantees based on your tags it may not. Depending on the size 
of the array, I would think this might be the case.

  Actually I _think_ it would in this case. Imagine a pure 
function that gives different results from the input when the 
input changes during execution. By making a local copy the input 
can't be changed from the outside and thereby returns that 
guarantee.

  Last try adding ref after const; At the off chance it's shallow 
copying, this should remove that.

Apr 27 2012

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Sat, Apr 28, 2012 at 12:29:24AM +0200, Era Scarecrow wrote:
 On Friday, 27 April 2012 at 16:02:00 UTC, Steven Schveighoffer
 wrote:

[...]
Hm.. you have marked all your functions pure as well.  I don't think
this will affect anything in the current implementation, but it
might.  However, I'd expect the opposite (const version is faster) if
anything.

 
 
  Perhaps I'm wrong, but seeing as your working with a struct I would
  think the following should be noted.
 
  First, pure and nothrow. Nothing wrong with labeling functions as
  such, but I would recommend removing them and once the whole thing is
  done and you get it in a finished state that you start adding them
  back in if you think they qualify. Recently I know I added a  safe
  token to a simple function which of course threw a range exception
  later. Oddly enough it was because I didn't check if the array was
  null before accessing it.

I recommend the opposite, actually. Most D code by default should be
 safe (don't know about nothrow though). It's good to mark most things
as  safe and pure, and let the compiler catch careless mistakes.


  Second, you are passing an array of structs. Either a fat pointer is
  being passed, or it's shallow copying; depending on the scenario
  either could be true. As an array it should be passing a fat pointer
  but if the compiler has to make certain guarantees based on your tags
  it may not. Depending on the size of the array, I would think this
  might be the case.

Dynamic arrays are always passed by reference (i.e. fat pointer). AFAIK
the compiler does not change this just because of certain tags on the
function.


 Actually I _think_ it would in this case. Imagine a pure function that
 gives different results from the input when the input changes during
 execution. By making a local copy the input can't be changed from the
 outside and thereby returns that guarantee.

[...]

No, that's wrong. The compiler checks the code at runtime to prevent
impure code from slipping into the binary. It does not do anything to
"patch over" impure code to make it pure.


T

-- 
Why are you blatanly misspelling "blatant"? -- Branden Robinson

Apr 27 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Friday, 27 April 2012 at 22:40:46 UTC, H. S. Teoh wrote:
 I recommend the opposite, actually. Most D code by default 
 should be  safe (don't know about nothrow though). It's good to 
 mark most things as  safe and pure, and let the compiler catch 
 careless mistakes.

  Your probably right..

 Dynamic arrays are always passed by reference (i.e. fat 
 pointer). AFAIK
 the compiler does not change this just because of certain tags 
 on the function.

  That's why i wasn't sure... I was pretty sure it was passed via 
fat pointer but if adding  safe and pure makes it slower, 
something else is going on. Perhaps just a ton more checks to 
make sure it doesn't during debug mode?

 No, that's wrong. The compiler checks the code at runtime to 
 prevent
 impure code from slipping into the binary. It does not do 
 anything to
 "patch over" impure code to make it pure.

  I'm not going to argue, I don't know enough about the D compiler 
to know exactly what's going on; Plus I'd rather be wrong than 
right :)

  I guess use a profiler and check where your bottlenecks are. The 
results may surprise you. I've only glanced over the code so I 
can't offer anything more concrete.

Apr 27 2012

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Saturday, April 28, 2012 01:26:38 Era Scarecrow wrote:
 On Friday, 27 April 2012 at 22:40:46 UTC, H. S. Teoh wrote:
 I recommend the opposite, actually. Most D code by default
 should be  safe (don't know about nothrow though). It's good to
 mark most things as  safe and pure, and let the compiler catch
 careless mistakes.

 
 Your probably right..
 
 Dynamic arrays are always passed by reference (i.e. fat
 pointer). AFAIK
 the compiler does not change this just because of certain tags
 on the function.

 
 That's why i wasn't sure... I was pretty sure it was passed via
 fat pointer but if adding  safe and pure makes it slower,
 something else is going on. Perhaps just a ton more checks to
 make sure it doesn't during debug mode?

pure and  safe have _zero_ affect on the types of function parameters or how 
they're passed. They allow the compiler to provide additional compile-time 
checks. The _only_ case that I'm aware of where  safe affects code generation 
is that array bounds checking is not removed in  safe code with -release like 
it is in  system and  trusted code (-noboundscheck will remove it in all 
code). pure can affect code optimizations in _calling_ code (e.g. making it so 
that a strongly pure function is only called once in an expression when it's 
called multiple times with the same arguments within that expression), but it 
doesn't affect the function's code generation at all.

_Nothing_ affects how arrays are passed beyond ref, out, and in. The type of an 
array doesn't change due to attributes on the function that it's a parameter 
for. Arrays are literally structs that looks something like

struct DynamicArray(T)
{
 T* ptr;
 size_t length;
}

and their semantics for being passed to a function are identical to what you'd 
expect from any struct with such a definition.

http://dlang.org/d-array-article.html

- Jonathan M Davis

Apr 27 2012

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 28/04/12 00:29, Era Scarecrow wrote:
 Last try adding ref after const; At the off chance it's shallow copying, this
 should remove that.

Ahhh, that works.  Thank you!

Back story: originally the reputation() function just took the array ratings
and 
made an internal copy, ratings_, which was used by the rest of the code.  I
took 
that out in this commit: 
https://github.com/WebDrake/Dregs/commit/4d2a8a055321c2981a453fc4d82fb781da2ea5c7

... because I found I got about a 2s speedup.  It's exactly the speedup which 
was removed by adding "const" to the function input, so I presume it's as you 
say, that this was implicitly creating a local copy.

Apr 27 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Saturday, 28 April 2012 at 00:04:05 UTC, Joseph Rushton 
Wakeling wrote:
 On 28/04/12 00:29, Era Scarecrow wrote:
 At the off chance it's shallow copying, this should remove 
 that.

 Ahhh, that works.  Thank you!

 ... because I found I got about a 2s speedup.  It's exactly the 
 speedup which was removed by adding "const" to the function 
 input, so I presume it's as you say, that this was implicitly 
 creating a local copy.

  Well glad you found it :) Probably the different 'type' made it 
force a (useless) conversion while inside the template.

  Odd that the wrong answer was also the right answer :)

Apr 27 2012

"bearophile" <bearophileHUGS lycos.com> writes:

Joseph Rushton Wakeling:

 ... because I found I got about a 2s speedup.  It's exactly the 
 speedup which was removed by adding "const" to the function 
 input, so I presume it's as you say, that this was implicitly 
 creating a local copy.

I suggest to take a look at the asm in both cases, and compare. 
Adding "const" doesn't cause local copies.

Bye,
bearophile

Apr 28 2012

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 28/04/12 14:53, bearophile wrote:
 I suggest to take a look at the asm in both cases, and compare. Adding "const"
 doesn't cause local copies.

I'm afraid I have no idea how to do that, or what to look for.

Apr 28 2012

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 27 Apr 2012 20:03:48 -0400, Joseph Rushton Wakeling  
<joseph.wakeling webdrake.net> wrote:

 On 28/04/12 00:29, Era Scarecrow wrote:
 Last try adding ref after const; At the off chance it's shallow  
 copying, this
 should remove that.

 Ahhh, that works.  Thank you!

 Back story: originally the reputation() function just took the array  
 ratings and made an internal copy, ratings_, which was used by the rest  
 of the code.  I took that out in this commit:  
 https://github.com/WebDrake/Dregs/commit/4d2a8a055321c2981a453fc4d82fb781da2ea5c7

 ... because I found I got about a 2s speedup.  It's exactly the speedup  
 which was removed by adding "const" to the function input, so I presume  
 it's as you say, that this was implicitly creating a local copy.

Try removing the ref and see if it goes back.  That usage of ref should  
not affect anything (if anything it should be slower, since it's an extra  
level of indirection).

There is no implicit local copy for const.  I have a suspicion that you  
changed two things and forgot about one of them when running your tests.

-Steve

Apr 30 2012

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 30/04/12 16:03, Steven Schveighoffer wrote:
 Try removing the ref and see if it goes back. That usage of ref should not
 affect anything (if anything it should be slower, since it's an extra level of
 indirection).

Removing the ref ups the time as described before, but only with GDC (DMD the 
runtime is the same).  It's a real effect.

 There is no implicit local copy for const. I have a suspicion that you changed
 two things and forgot about one of them when running your tests.

Really don't think so.  You can even check the version history of changes if
you 
like!

May 02 2012

"bearophile" <bearophileHUGS lycos.com> writes:

Joseph Rushton Wakeling:

 Removing the ref ups the time as described before, but only 
 with GDC (DMD the runtime is the same).  It's a real effect.

 There is no implicit local copy for const. I have a suspicion 
 that you changed
 two things and forgot about one of them when running your 
 tests.

 Really don't think so.  You can even check the version history 
 of changes if you like!

Then it's seems a compiler bug. Please, minimize it as much as
possible, and put it in Bugzilla.

Bye,
bearophile

May 02 2012

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 02 May 2012 16:05:13 -0400, Joseph Rushton Wakeling  
<joseph.wakeling webdrake.net> wrote:

 On 30/04/12 16:03, Steven Schveighoffer wrote:
 Try removing the ref and see if it goes back. That usage of ref should  
 not
 affect anything (if anything it should be slower, since it's an extra  
 level of
 indirection).

 Removing the ref ups the time as described before, but only with GDC  
 (DMD the runtime is the same).  It's a real effect.

I'm not familiar with GDC, but as bearophile says, it sounds like a  
compiler bug.  Maybe GDC is making an optimization in one case and not in  
the other, but both should be equivalently able to get the optimization.

 There is no implicit local copy for const. I have a suspicion that you  
 changed
 two things and forgot about one of them when running your tests.

 Really don't think so.  You can even check the version history of  
 changes if you like!

IIRC, your original post concerned not-checked in code.  Anyway, if you've  
re-run the tests, this is a confirmation.  I'd go ahead and file a bug  
against GDC.  Rest assured, const should not *slow down* anything.

-Steve

May 03 2012

Artur Skawina <art.08.09 gmail.com> writes:

On 05/03/12 16:07, Steven Schveighoffer wrote:
 On Wed, 02 May 2012 16:05:13 -0400, Joseph Rushton Wakeling
<joseph.wakeling webdrake.net> wrote:
 
 On 30/04/12 16:03, Steven Schveighoffer wrote:
 Try removing the ref and see if it goes back. That usage of ref should not
 affect anything (if anything it should be slower, since it's an extra level of
 indirection).

 Removing the ref ups the time as described before, but only with GDC (DMD the
runtime is the same).  It's a real effect.

 
 I'm not familiar with GDC, but as bearophile says, it sounds like a compiler
bug.  Maybe GDC is making an optimization in one case and not in the other, but
both should be equivalently able to get the optimization.
 
 There is no implicit local copy for const. I have a suspicion that you changed
 two things and forgot about one of them when running your tests.

 Really don't think so.  You can even check the version history of changes if
you like!

 
 IIRC, your original post concerned not-checked in code.  Anyway, if you've
re-run the tests, this is a confirmation.  I'd go ahead and file a bug against
GDC.  Rest assured, const should not *slow down* anything.

There *was* a GDC bug some time ago, where a 'const' or 'in' function argument
prevented inlining. Iain fixed it, so unless an old GDC version is used, it's
probably not related.

http://forum.dlang.org/thread/jdhb57$10vf$1 digitalmars.com?page=16#post-mailman.33.1325763631.16222.d.gnu:40puremagic.com

artur

May 03 2012

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 03/05/12 16:07, Steven Schveighoffer wrote:
 IIRC, your original post concerned not-checked in code. Anyway, if you've
re-run
 the tests, this is a confirmation. I'd go ahead and file a bug against GDC.
Rest
 assured, const should not *slow down* anything.

Yes, with the original post I was hoping there might be a quick answer that 
would avoid anyone having to go through the details of my code, e.g. a known 
issue or reason.  But I can confirm that the effect is real and consistent; I 
tested it again just now, and it even scales with the system size.

I've made some attempts to construct a minimal example, as opposed to my rather 
complicated code, but so far have been unable to reproduce the effect with a 
simple case. :-(  What I could do is try and cut away from the existing code
and 
see where the effect vanishes, or prepare a set of single-file versions of the 
code with no qualifiers, const alone, and const ref respectively.

On 03/05/12 16:59, Artur Skawina wrote:
 There *was* a GDC bug some time ago, where a 'const' or 'in' function argument
 prevented inlining. Iain fixed it, so unless an old GDC version is used, it's
 probably not related.

I'm running GDC 4.6.3; I don't know if this fix made it into that release.  It 
would make sense if faulty inlining was responsible, though.

May 04 2012

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 04/05/12 18:07, Joseph Rushton Wakeling wrote:
 On 03/05/12 16:59, Artur Skawina wrote:
 There *was* a GDC bug some time ago, where a 'const' or 'in' function argument
 prevented inlining. Iain fixed it, so unless an old GDC version is used, it's
 probably not related.

 I'm running GDC 4.6.3; I don't know if this fix made it into that release. It
 would make sense if faulty inlining was responsible, though.

Let me clarify that last statement.  The core function in the struct is one
that 
does something like this (pseudocode):

     auto myMainFunc(Rating[] x)
     {
         do {
             myFunc1(x);
             myFunc2(x);
             myFunc3(x);
             /* calculate exit-condition value*/
         } while (/* conditions not met */)

         return results;
     }

So, if I'm putting in const (or const ref) qualifiers, they're being applied to 
the inputs of myMainFunc, myFunc1, myFunc2 and myFunc3.  These 3 are obvious 
candidates for compiler inlining, so anything which affects the possibility of 
this happening could be expected to hit performance.

(... in the real code, myMainFunc() is called reputation(); the 3 interior 
functions are respectively userDivergence(), userReputation() and 
objectReputation().)

May 04 2012

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Friday, April 27, 2012 11:18:26 Steven Schveighoffer wrote:
 const should not affect code generation *at all*, except for name mangling
 (const MyStruct is a different type from MyStruct), and generating an
 extra TypeInfo for const MyStruct and const MyStruct[]. Const is purely a
 compile-time concept.

Thanks to the fact that const is transitive and that it's illegal to cast it 
away to use mutation, const _can_ affect code optimizations, but I don't know 
exactly under what circumstances it does in the current compiler.

Still, like you, if anything, I'd expect const to be faster rather than slower 
if there's any speed difference. So, what's happening is a bit weird, and I 
have no idea what's going on.

- Jonathan M Davis

Apr 27 2012

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 27 Apr 2012 13:23:46 -0400, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Friday, April 27, 2012 11:18:26 Steven Schveighoffer wrote:
 const should not affect code generation *at all*, except for name  
 mangling
 (const MyStruct is a different type from MyStruct), and generating an
 extra TypeInfo for const MyStruct and const MyStruct[]. Const is purely  
 a
 compile-time concept.

 Thanks to the fact that const is transitive and that it's illegal to  
 cast it
 away to use mutation, const _can_ affect code optimizations, but I don't  
 know
 exactly under what circumstances it does in the current compiler.

No, it can't.  There can easily be another non-const reference to the same  
data.  Pure functions can make more assumptions, based on the types, but  
it would be a very complex determination in the type system to see if two  
parameters alias the same data.

Real optimization benefits come into play when immutable is there.

-Steve

Apr 27 2012

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Friday, April 27, 2012 14:26:52 Steven Schveighoffer wrote:
 On Fri, 27 Apr 2012 13:23:46 -0400, Jonathan M Davis <jmdavisProg gmx.com>
 
 wrote:
 On Friday, April 27, 2012 11:18:26 Steven Schveighoffer wrote:
 const should not affect code generation *at all*, except for name
 mangling
 (const MyStruct is a different type from MyStruct), and generating an
 extra TypeInfo for const MyStruct and const MyStruct[]. Const is purely
 a
 compile-time concept.

 
 Thanks to the fact that const is transitive and that it's illegal to
 cast it
 away to use mutation, const _can_ affect code optimizations, but I don't
 know
 exactly under what circumstances it does in the current compiler.

 
 No, it can't. There can easily be another non-const reference to the same
 data.

Except that if the variable isn't shared, then in many cases, it doesn't 
matter that there can be another reference - particularly when combined with 
pure. So, the compiler definitely should be able to use const for optimizations 
in at least some cases. It wouldn't surprise me if it doesn't use it for that 
all that much right now though.

 Pure functions can make more assumptions, based on the types, but
 it would be a very complex determination in the type system to see if two
 parameters alias the same data.

Yes. But that doesn't matter in many cases thanks to D's thread-local by 
default. With shared, on the other hand, const becomes useless for 
optimizations for the very reason that you give. And yes, there _are_ cases 
where const can't be used for optimizations because other references are used 
which might refer to the same data, but particularly if no other references of 
the same type are used in a section of code, then the compiler should be able 
to determine that the object really hasn't changed thanks to const.

 Real optimization benefits come into play when immutable is there.

Definitely. immutable allows for much better optimizations than const does, but 
there _are_ cases where const allows for optimizations.

- Jonathan M Davis

Apr 27 2012

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 27/04/12 20:26, Steven Schveighoffer wrote:
 No, it can't. There can easily be another non-const reference to the same data.
 Pure functions can make more assumptions, based on the types, but it would be a
 very complex determination in the type system to see if two parameters alias
the
 same data.

 Real optimization benefits come into play when immutable is there.

Question on the const/immutable distinction.  Given that my function has inputs 
of (size_t, size_t, Rating[]), how come the size_t's can be made immutable, but 
the Rating[] can only be made const/const ref?

Is it because the size_t's can't conceivably be changed from outside while the 
function is running, whereas the values in the array in principle can be?

Apr 28 2012

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday, April 28, 2012 13:17:36 Joseph Rushton Wakeling wrote:
 On 27/04/12 20:26, Steven Schveighoffer wrote:
 No, it can't. There can easily be another non-const reference to the same
 data. Pure functions can make more assumptions, based on the types, but
 it would be a very complex determination in the type system to see if two
 parameters alias the same data.
 
 Real optimization benefits come into play when immutable is there.

 
 Question on the const/immutable distinction.  Given that my function has
 inputs of (size_t, size_t, Rating[]), how come the size_t's can be made
 immutable, but the Rating[] can only be made const/const ref?
 
 Is it because the size_t's can't conceivably be changed from outside while
 the function is running, whereas the values in the array in principle can
 be?

size_t is a value type. Arrays are reference types. So, when you pass a 
size_t, you get a copy, but when you pass an array, you get a slice of the 
array. You could make that slice const, but you can't make something 
immutable, because immutable stuff must _always_ be immutable. If you want an 
immutable array (or array of immutable elements), you must make a copy. If you 
call idup on an array, you'll get a copy of that array where each of its 
elements are immutable.

And yes, because size_t is copied, it can't be changed from the outside, 
whereas because Rating[] is only sliced, it can be. const protects against it 
being altered within the function (for both of them), but doesn't protect the 
original array from being modified. immutable, on the other hand, is _always_ 
immutable and can never be mutated. But because of that, you can't convert 
something to immutable without making a copy (which happens automatically with 
value types but must be done explicitly with reference types).

- Jonathan M Davis

Apr 28 2012

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 27/04/12 19:23, Jonathan M Davis wrote:
 Thanks to the fact that const is transitive and that it's illegal to cast it
 away to use mutation, const _can_ affect code optimizations, but I don't know
 exactly under what circumstances it does in the current compiler.

I should add that I'm using GDC 4.6.3 here, not DMD.  I just checked with the 
latter and don't see any speed difference with const/non-const.  But the DMD 
compiled code runs about 20s slower in total anyway, so ... ;-)

 Still, like you, if anything, I'd expect const to be faster rather than slower
 if there's any speed difference. So, what's happening is a bit weird, and I
 have no idea what's going on.

Yup. :-\

Apr 27 2012

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Friday, April 27, 2012 14:00:13 H. S. Teoh wrote:
 In -release mode, array bounds checking is turned off for speed reasons.
 The idea being that before you compile with -release you've already
 extensively tested your program and are sure that simple bugs like array
 overruns have been weeded out.

The checks are actually left in for  safe code even with -release. You need -
noboundscheck to disable themin all code. But yes, many array bounds checks do 
get stripped out with -release, and you can never rely on them being there, 
because they can go away, depending on the flags.

- Jonathan M Davis

Apr 27 2012

D Programming

C/C++ Programming

Other

digitalmars.D.learn - Passing array as const slows down code?