www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Passing array as const slows down code?

reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
Hello all,

A surprise today when tweaking some code.

I have a function which takes as input an array of values (the values are a 
custom struct).  So, in pseudocode, it's something like this:

     struct MyStruct
     {
         size_t x;
         size_t y;
         double z;
     }

     void foo(MyStruct[] input)
     {
         ...
     }

Since the input isn't modified, I thought I'd tag it as const[1]:

     void foo(const MyStruct[] input) { ... }

... but this turned out to noticeably slow down the program.  (Noticeably as
in, 
a 25-second as opposed to 23-second running time, consistently observed on 
repeated trials of different code versions.)

I'm surprised by this as I would have thought if anything tagging a variable as 
const would if anything have increased the speed, or at worst left it the same 
as before.

Can anyone explain why?  It's no big deal, but I'm curious.

Thanks and best wishes,

     -- Joe

-------
[1] Actually I originally tried marking it as immutable, but it turns out
that's 
a stronger condition that doesn't just apply to what happens _inside_ the
function.
Apr 27 2012
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 27 Apr 2012 11:12:07 -0400, Joseph Rushton Wakeling  
<joseph.wakeling webdrake.net> wrote:

 Hello all,

 A surprise today when tweaking some code.

 I have a function which takes as input an array of values (the values  
 are a custom struct).  So, in pseudocode, it's something like this:

      struct MyStruct
      {
          size_t x;
          size_t y;
          double z;
      }

      void foo(MyStruct[] input)
      {
          ...
      }

 Since the input isn't modified, I thought I'd tag it as const[1]:

      void foo(const MyStruct[] input) { ... }

 ... but this turned out to noticeably slow down the program.   
 (Noticeably as in, a 25-second as opposed to 23-second running time,  
 consistently observed on repeated trials of different code versions.)

 I'm surprised by this as I would have thought if anything tagging a  
 variable as const would if anything have increased the speed, or at  
 worst left it the same as before.

 Can anyone explain why?  It's no big deal, but I'm curious.

const should not affect code generation *at all*, except for name mangling (const MyStruct is a different type from MyStruct), and generating an extra TypeInfo for const MyStruct and const MyStruct[]. Const is purely a compile-time concept. This cannot account for an extra 2 seconds. Something else is happening. Without more of your code, nobody can give a definitive answer except it is not supposed to affect it. -Steve
Apr 27 2012
parent Artur Skawina <art.08.09 gmail.com> writes:
On 05/03/12 16:07, Steven Schveighoffer wrote:
 On Wed, 02 May 2012 16:05:13 -0400, Joseph Rushton Wakeling
<joseph.wakeling webdrake.net> wrote:
 
 On 30/04/12 16:03, Steven Schveighoffer wrote:
 Try removing the ref and see if it goes back. That usage of ref should not
 affect anything (if anything it should be slower, since it's an extra level of
 indirection).

Removing the ref ups the time as described before, but only with GDC (DMD the runtime is the same). It's a real effect.

I'm not familiar with GDC, but as bearophile says, it sounds like a compiler bug. Maybe GDC is making an optimization in one case and not in the other, but both should be equivalently able to get the optimization.
 There is no implicit local copy for const. I have a suspicion that you changed
 two things and forgot about one of them when running your tests.

Really don't think so. You can even check the version history of changes if you like!

IIRC, your original post concerned not-checked in code. Anyway, if you've re-run the tests, this is a confirmation. I'd go ahead and file a bug against GDC. Rest assured, const should not *slow down* anything.

There *was* a GDC bug some time ago, where a 'const' or 'in' function argument prevented inlining. Iain fixed it, so unless an old GDC version is used, it's probably not related. http://forum.dlang.org/thread/jdhb57$10vf$1 digitalmars.com?page=16#post-mailman.33.1325763631.16222.d.gnu:40puremagic.com artur
May 03 2012
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 27/04/12 17:18, Steven Schveighoffer wrote:
 const should not affect code generation *at all*, except for name mangling
 (const MyStruct is a different type from MyStruct), and generating an extra
 TypeInfo for const MyStruct and const MyStruct[]. Const is purely a
compile-time
 concept.

 This cannot account for an extra 2 seconds. Something else is happening.

 Without more of your code, nobody can give a definitive answer except it is not
 supposed to affect it.

The code is here: https://github.com/WebDrake/Dregs/ ... and the particular file concerned is https://github.com/WebDrake/Dregs/blob/master/dregs/codetermine.d You'll see that there are about 8 different functions in there which receive as input an array of type Rating!(UserID, ObjectID, Reputation)[]. These were the inputs I was marking as const.
Apr 27 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 27 Apr 2012 11:33:39 -0400, Joseph Rushton Wakeling  
<joseph.wakeling webdrake.net> wrote:

 On 27/04/12 17:18, Steven Schveighoffer wrote:
 const should not affect code generation *at all*, except for name  
 mangling
 (const MyStruct is a different type from MyStruct), and generating an  
 extra
 TypeInfo for const MyStruct and const MyStruct[]. Const is purely a  
 compile-time
 concept.

 This cannot account for an extra 2 seconds. Something else is happening.

 Without more of your code, nobody can give a definitive answer except  
 it is not
 supposed to affect it.

The code is here: https://github.com/WebDrake/Dregs/ ... and the particular file concerned is https://github.com/WebDrake/Dregs/blob/master/dregs/codetermine.d You'll see that there are about 8 different functions in there which receive as input an array of type Rating!(UserID, ObjectID, Reputation)[]. These were the inputs I was marking as const.

Hm.. you have marked all your functions pure as well. I don't think this will affect anything in the current implementation, but it might. However, I'd expect the opposite (const version is faster) if anything. -Steve
Apr 27 2012
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 27/04/12 18:02, Steven Schveighoffer wrote:
 Hm.. you have marked all your functions pure as well. I don't think this will
 affect anything in the current implementation, but it might. However, I'd
expect
 the opposite (const version is faster) if anything.

Yes, me too, hence confusion. I have to say I'm somewhat confused about whether the pure markings are really accurate. The reputation() modifies the reputationUser_ and reputationObject_ internal arrays and hence the returned values of reputationUser() and reputationObject(). It's true that for a given input to reputation(), the outcome will always be the same, but there's nevertheless mutability involved: if I pass one input to reputation(), and then another, the reputationObject() and ...User() outputs will change as a result. I suppose it's a kind of weak purity, but I'm not too happy with it. I suppose I could get reputation() to _return_ reputationObject and reputationUser instead of having them called via functions; but I'd be worried it would slow everything down substantially. Background to this piece of code: it's a D rewrite of some C++ I wrote a couple of years back as part of a research project. I took quite a lot of time to make the C++ as efficient as possible, so this is a nice test case to see if I can get D to produce similar efficiency while having much more elegantly-written code. It's also just a nice way to learn to be "D-ish", working on a problem that I already understand well. Currently the D code is MUCH more elegant and easy to read than the C++, and runs about 2 seconds slower for the test case implemented here (so on my machine, C++ does it in about 21s as opposed to D's 23), with that difference scaling with problem size. It's a tolerable difference, though it would be nice to see if I can close the gap.
Apr 27 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 27 Apr 2012 12:37:41 -0400, Joseph Rushton Wakeling  
<joseph.wakeling webdrake.net> wrote:

 I have to say I'm somewhat confused about whether the pure markings are  
 really accurate.  The reputation() modifies the reputationUser_ and  
 reputationObject_ internal arrays and hence the returned values of  
 reputationUser() and reputationObject().  It's true that for a given  
 input to reputation(), the outcome will always be the same, but there's  
 nevertheless mutability involved: if I pass one input to reputation(),  
 and then another, the reputationObject() and ...User() outputs will  
 change as a result.

weak purity, I think, is one of the most revolutionary ideas to come from D. Essentially, we get compiler-checked pure functions that can be written imperatively instead of functionally. If the function is weakly pure, then it cannot be optimized based on purity, so I don't think there is any way marking it pure *hurts*. -Steve
Apr 27 2012
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 27/04/12 18:37, Joseph Rushton Wakeling wrote:
 I suppose I could get reputation() to _return_ reputationObject and
 reputationUser instead of having them called via functions; but I'd be worried
 it would slow everything down substantially.

... well what do I know. I tweaked it to do just that, and it runs at exactly the same speed. https://github.com/WebDrake/Dregs/commit/9926094645118ecf03c3bedd57f923b02e2bb7fc
Apr 27 2012
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, April 27, 2012 11:18:26 Steven Schveighoffer wrote:
 const should not affect code generation *at all*, except for name mangling
 (const MyStruct is a different type from MyStruct), and generating an
 extra TypeInfo for const MyStruct and const MyStruct[]. Const is purely a
 compile-time concept.

Thanks to the fact that const is transitive and that it's illegal to cast it away to use mutation, const _can_ affect code optimizations, but I don't know exactly under what circumstances it does in the current compiler. Still, like you, if anything, I'd expect const to be faster rather than slower if there's any speed difference. So, what's happening is a bit weird, and I have no idea what's going on. - Jonathan M Davis
Apr 27 2012
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 27/04/12 19:08, Steven Schveighoffer wrote:
 weak purity, I think, is one of the most revolutionary ideas to come from D.
 Essentially, we get compiler-checked pure functions that can be written
 imperatively instead of functionally.

I do agree with that; it's a very attractive aspect of D that a function or object can be internally imperative but pure as far as the outside world is concerned. Seeing that documented in Andrei's writings was another "Wow!" moment for me about D. :-) What concerned me here was that the way my reputation() function was set up actually did change the public state of the object, which to my mind isn't even weakly pure. But I just wrote a fix which, contrary to my fears, didn't affect the speed of the program.
 If the function is weakly pure, then it cannot be optimized based on purity, so
 I don't think there is any way marking it pure *hurts*.

I was more concerned that the compiler wasn't identifying what to me was a violation of purity. I'm fairly sure I can also find a way to make some of those "nothrow" functions throw an error ...
Apr 27 2012
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 27/04/12 19:23, Jonathan M Davis wrote:
 Thanks to the fact that const is transitive and that it's illegal to cast it
 away to use mutation, const _can_ affect code optimizations, but I don't know
 exactly under what circumstances it does in the current compiler.

I should add that I'm using GDC 4.6.3 here, not DMD. I just checked with the latter and don't see any speed difference with const/non-const. But the DMD compiled code runs about 20s slower in total anyway, so ... ;-)
 Still, like you, if anything, I'd expect const to be faster rather than slower
 if there's any speed difference. So, what's happening is a bit weird, and I
 have no idea what's going on.

Yup. :-\
Apr 27 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 27 Apr 2012 13:25:30 -0400, Joseph Rushton Wakeling  
<joseph.wakeling webdrake.net> wrote:

 On 27/04/12 19:08, Steven Schveighoffer wrote:
 weak purity, I think, is one of the most revolutionary ideas to come  
 from D.
 Essentially, we get compiler-checked pure functions that can be written
 imperatively instead of functionally.

I do agree with that; it's a very attractive aspect of D that a function or object can be internally imperative but pure as far as the outside world is concerned. Seeing that documented in Andrei's writings was another "Wow!" moment for me about D. :-) What concerned me here was that the way my reputation() function was set up actually did change the public state of the object, which to my mind isn't even weakly pure. But I just wrote a fix which, contrary to my fears, didn't affect the speed of the program.

Weakly pure simply means it can be called by a strong-pure function. It has no other benefits except for that, and is basically a normal non-pure function otherwise. It's a weird thing, and it's hard to wrap your mind around. I don't think anyone has ever done it before, so it's hard to explain. But the benefits are *enormous*. Now, instead of pure versions of so many normal functions, much of the standard library can be marked as pure, and callable from either pure or unpure functions. It opens up the whole library to pure functions which would otherwise be a painstaking porting effort.
 If the function is weakly pure, then it cannot be optimized based on  
 purity, so
 I don't think there is any way marking it pure *hurts*.

I was more concerned that the compiler wasn't identifying what to me was a violation of purity. I'm fairly sure I can also find a way to make some of those "nothrow" functions throw an error ...

It depends on what your definition of "purity" is. For D's definition, it's pure, otherwise the compiler would reject it. The definition of pure for D is, it cannot accept any shared data as parameters, and it cannot access any global non-immutable data. That's it. There's no requirement for immutability of parameters or results. In hindsight, D's 'pure' keyword really should be named something else, since it varies from the traditional definition of 'pure'. But it is what it is now, and no changing it. nothrow means it cannot throw an Exception. It can still throw an error, or technically even something else derived from Throwable. This is necessary, because otherwise, nothing could be nothrow (anything might throw an out of memory error). -Steve
Apr 27 2012
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Apr 27, 2012 at 07:25:30PM +0200, Joseph Rushton Wakeling wrote:
 On 27/04/12 19:08, Steven Schveighoffer wrote:
weak purity, I think, is one of the most revolutionary ideas to come
from D.  Essentially, we get compiler-checked pure functions that can
be written imperatively instead of functionally.

I do agree with that; it's a very attractive aspect of D that a function or object can be internally imperative but pure as far as the outside world is concerned. Seeing that documented in Andrei's writings was another "Wow!" moment for me about D. :-) What concerned me here was that the way my reputation() function was set up actually did change the public state of the object, which to my mind isn't even weakly pure. But I just wrote a fix which, contrary to my fears, didn't affect the speed of the program.

As mentioned by others, D internally recognizes two kinds of purity, which is unofficially called "strong purity" and "weak purity". The idea is that strong purity corresponds with what functional programming languages call "pure", whereas weak purity allows mutation of state outside the function, *but only through function parameters*. The idea is that strongly pure functions are allowed to call weakly pure functions, provided any reference parameters passed in never escape the scope of the strongly pure function. For example, a strongly pure function is allowed to pass in pointers to local variables which the weakly pure function can modify through the pointer: pure real computationHelper(real arg1, ref real arg2) { // This mutates arg2, so this function is only weakly // pure arg2 = sin(arg1); return cos(arg1); } pure real complexComputation(real[] args) { real tempValue; ... // Note: this changes tempValue real anotherTempValue = computationHelper(args[0], tempValue); // But since tempValue is only used inside this // function, it does not violate strong purity return tempValue + complicatedFunc(args[1]); } As far as the outside world is concerned, complexComputation is a strongly pure function, because even though computationHelper mutates stuff through its arguments, those mutations are all local to complexComputation, and does not actually touch global world state outside. Weakly pure functions cannot mutate anything that they don't receive through their arguments (the 'this' pointer is considered part of their arguments, it's just implicit), so as long as the strongly pure function never passes in a pointer to the outside world, everything is OK. The motivation for distinguishing between these types of purity is to expand the scope of what the strongly-pure function can call. By allowing weakly pure functions to be called from strongly-pure functions, we open up many more implementation possibilities while still retaining all the benefits of strong purity.
If the function is weakly pure, then it cannot be optimized based on
purity, so I don't think there is any way marking it pure *hurts*.

I was more concerned that the compiler wasn't identifying what to me was a violation of purity. I'm fairly sure I can also find a way to make some of those "nothrow" functions throw an error ...

It's not a violation of purity, it's just "weak purity". If you try to access a global variable, for example, it will trigger an error. And nothrow functions *are* allowed to throw Error objects. That's also a deliberate decision. :-) T -- "640K ought to be enough" -- Bill G., 1984. "The Internet is not a primary goal for PC usage" -- Bill G., 1995. "Linux has no impact on Microsoft's strategy" -- Bill G., 1999.
Apr 27 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 27 Apr 2012 13:23:46 -0400, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Friday, April 27, 2012 11:18:26 Steven Schveighoffer wrote:
 const should not affect code generation *at all*, except for name  
 mangling
 (const MyStruct is a different type from MyStruct), and generating an
 extra TypeInfo for const MyStruct and const MyStruct[]. Const is purely  
 a
 compile-time concept.

Thanks to the fact that const is transitive and that it's illegal to cast it away to use mutation, const _can_ affect code optimizations, but I don't know exactly under what circumstances it does in the current compiler.

No, it can't. There can easily be another non-const reference to the same data. Pure functions can make more assumptions, based on the types, but it would be a very complex determination in the type system to see if two parameters alias the same data. Real optimization benefits come into play when immutable is there. -Steve
Apr 27 2012
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 27/04/12 20:25, H. S. Teoh wrote:
 On Fri, Apr 27, 2012 at 07:25:30PM +0200, Joseph Rushton Wakeling wrote:
 I was more concerned that the compiler wasn't identifying what to me
 was a violation of purity.  I'm fairly sure I can also find a way to
 make some of those "nothrow" functions throw an error ...

It's not a violation of purity, it's just "weak purity". If you try to access a global variable, for example, it will trigger an error.

Thanks for the extended description of weak purity -- it's been very helpful in understanding the concept better. Is there a particular way in which I can explicitly mark a function as strongly pure?
 And nothrow functions *are* allowed to throw Error objects. That's also
 a deliberate decision. :-)

... yes, as I just found out when I decided to test it 2 minutes ago :-) OTOH I found that with or without the nothrow option, when the -release flag was used in compiling the code, the error was not thrown and the program did not exit -- it just sat there seemingly running but doing nothing. This was unexpected ... The deliberate error was in this case a range exception when accessing an array.
Apr 27 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 27 Apr 2012 14:29:40 -0400, Joseph Rushton Wakeling  
<joseph.wakeling webdrake.net> wrote:

 On 27/04/12 20:25, H. S. Teoh wrote:
 On Fri, Apr 27, 2012 at 07:25:30PM +0200, Joseph Rushton Wakeling wrote:
 I was more concerned that the compiler wasn't identifying what to me
 was a violation of purity.  I'm fairly sure I can also find a way to
 make some of those "nothrow" functions throw an error ...

It's not a violation of purity, it's just "weak purity". If you try to access a global variable, for example, it will trigger an error.

Thanks for the extended description of weak purity -- it's been very helpful in understanding the concept better. Is there a particular way in which I can explicitly mark a function as strongly pure?

No, just make sure all the parameters and the result are either immutable or implicitly castable to immutable (hard to explain this better). Hm... gives me a thought that unit tests should have a helper that allows ensuring this: static assert(isStrongPure!fn); Or maybe __traits(isStrongPure, fn) if it's too difficult to do in a library.
 ... yes, as I just found out when I decided to test it 2 minutes ago  
 :-)  OTOH I found that with or without the nothrow option, when the  
 -release flag was used in compiling the code, the error was not thrown  
 and the program did not exit -- it just sat there seemingly running but  
 doing nothing.  This was unexpected ...

 The deliberate error was in this case a range exception when accessing  
 an array.

array bounds checks and asserts are turned off during -release compilation :) -Steve
Apr 27 2012
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, April 27, 2012 14:26:52 Steven Schveighoffer wrote:
 On Fri, 27 Apr 2012 13:23:46 -0400, Jonathan M Davis <jmdavisProg gmx.com>
 
 wrote:
 On Friday, April 27, 2012 11:18:26 Steven Schveighoffer wrote:
 const should not affect code generation *at all*, except for name
 mangling
 (const MyStruct is a different type from MyStruct), and generating an
 extra TypeInfo for const MyStruct and const MyStruct[]. Const is purely
 a
 compile-time concept.

Thanks to the fact that const is transitive and that it's illegal to cast it away to use mutation, const _can_ affect code optimizations, but I don't know exactly under what circumstances it does in the current compiler.

No, it can't. There can easily be another non-const reference to the same data.

Except that if the variable isn't shared, then in many cases, it doesn't matter that there can be another reference - particularly when combined with pure. So, the compiler definitely should be able to use const for optimizations in at least some cases. It wouldn't surprise me if it doesn't use it for that all that much right now though.
 Pure functions can make more assumptions, based on the types, but
 it would be a very complex determination in the type system to see if two
 parameters alias the same data.

Yes. But that doesn't matter in many cases thanks to D's thread-local by default. With shared, on the other hand, const becomes useless for optimizations for the very reason that you give. And yes, there _are_ cases where const can't be used for optimizations because other references are used which might refer to the same data, but particularly if no other references of the same type are used in a section of code, then the compiler should be able to determine that the object really hasn't changed thanks to const.
 Real optimization benefits come into play when immutable is there.

Definitely. immutable allows for much better optimizations than const does, but there _are_ cases where const allows for optimizations. - Jonathan M Davis
Apr 27 2012
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 27/04/12 20:39, Steven Schveighoffer wrote:
 No, just make sure all the parameters and the result are either immutable or
 implicitly castable to immutable (hard to explain this better).

 Hm... gives me a thought that unit tests should have a helper that allows
 ensuring this:

 static assert(isStrongPure!fn);

 Or maybe __traits(isStrongPure, fn) if it's too difficult to do in a library.

Not worth adding a strongpure (purest?) attribute to enable coders to explicitly mark their expectations?
 array bounds checks and asserts are turned off during -release compilation :)

Maybe I've misunderstood what that means, but I would still have expected the program to stop running at least. What was it doing instead ... ? :-)
Apr 27 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 27 Apr 2012 15:28:48 -0400, Joseph Rushton Wakeling  
<joseph.wakeling webdrake.net> wrote:

 On 27/04/12 20:39, Steven Schveighoffer wrote:
 No, just make sure all the parameters and the result are either  
 immutable or
 implicitly castable to immutable (hard to explain this better).

 Hm... gives me a thought that unit tests should have a helper that  
 allows
 ensuring this:

 static assert(isStrongPure!fn);

 Or maybe __traits(isStrongPure, fn) if it's too difficult to do in a  
 library.

Not worth adding a strongpure (purest?) attribute to enable coders to explicitly mark their expectations?

In most cases it's not that important. strong Pure is simply for optimization. Plus, what that would do is guarantee that the function only compiles if the parameters and return value are implicitly convertible to immutable. There is no other special requirements, and the parameters and return type are sitting right there, next to the strongpure function, it's likely not necessary to re-state the intent. Note that, even if a strong pure function becomes a weak-pure one, it's still valid (and compiles, and runs). It can be viewed similarly to inline, where the compiler makes the decision. Sometimes it's nice to force the issue, but for the most part, the compiler will know better what should be inlined.
 array bounds checks and asserts are turned off during -release  
 compilation :)

Maybe I've misunderstood what that means, but I would still have expected the program to stop running at least. What was it doing instead ... ? :-)

It didn't crash because the memory it accessed was still part of the program's address space. out of bounds references don't have to necessarily point at invalid memory. -Steve
Apr 27 2012
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Apr 27, 2012 at 08:29:40PM +0200, Joseph Rushton Wakeling wrote:
 On 27/04/12 20:25, H. S. Teoh wrote:
On Fri, Apr 27, 2012 at 07:25:30PM +0200, Joseph Rushton Wakeling wrote:
I was more concerned that the compiler wasn't identifying what to me
was a violation of purity.  I'm fairly sure I can also find a way to
make some of those "nothrow" functions throw an error ...

It's not a violation of purity, it's just "weak purity". If you try to access a global variable, for example, it will trigger an error.

Thanks for the extended description of weak purity -- it's been very helpful in understanding the concept better. Is there a particular way in which I can explicitly mark a function as strongly pure?

The "strong/weak" distinction is deduced by the compiler internally.
And nothrow functions *are* allowed to throw Error objects. That's
also a deliberate decision. :-)

... yes, as I just found out when I decided to test it 2 minutes ago :-) OTOH I found that with or without the nothrow option, when the -release flag was used in compiling the code, the error was not thrown and the program did not exit -- it just sat there seemingly running but doing nothing. This was unexpected ... The deliberate error was in this case a range exception when accessing an array.

In -release mode, array bounds checking is turned off for speed reasons. The idea being that before you compile with -release you've already extensively tested your program and are sure that simple bugs like array overruns have been weeded out. T -- IBM = I Blame Microsoft
Apr 27 2012
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, April 27, 2012 14:00:13 H. S. Teoh wrote:
 In -release mode, array bounds checking is turned off for speed reasons.
 The idea being that before you compile with -release you've already
 extensively tested your program and are sure that simple bugs like array
 overruns have been weeded out.

The checks are actually left in for safe code even with -release. You need - noboundscheck to disable themin all code. But yes, many array bounds checks do get stripped out with -release, and you can never rely on them being there, because they can go away, depending on the flags. - Jonathan M Davis
Apr 27 2012
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Friday, 27 April 2012 at 16:02:00 UTC, Steven Schveighoffer 
wrote:
 On Fri, 27 Apr 2012 11:33:39 -0400, Joseph Rushton Wakeling
 On 27/04/12 17:18, Steven Schveighoffer wrote:


 const should not affect code generation *at all*, except for 
 name mangling
 (const MyStruct is a different type from MyStruct), and 
 generating an extra
 TypeInfo for const MyStruct and const MyStruct[]. Const is 
 purely a compile-time
 concept.

 This cannot account for an extra 2 seconds. Something else is 
 happening.



 The code is here: https://github.com/WebDrake/Dregs/

 You'll see that there are about 8 different functions in there 
 which receive as input an array of type Rating!(UserID, 
 ObjectID, Reputation)[].  These were the inputs I was marking 
 as const.

Hm.. you have marked all your functions pure as well. I don't think this will affect anything in the current implementation, but it might. However, I'd expect the opposite (const version is faster) if anything.

Perhaps I'm wrong, but seeing as your working with a struct I would think the following should be noted. First, pure and nothrow. Nothing wrong with labeling functions as such, but I would recommend removing them and once the whole thing is done and you get it in a finished state that you start adding them back in if you think they qualify. Recently I know I added a safe token to a simple function which of course threw a range exception later. Oddly enough it was because I didn't check if the array was null before accessing it. Second, you are passing an array of structs. Either a fat pointer is being passed, or it's shallow copying; depending on the scenario either could be true. As an array it should be passing a fat pointer but if the compiler has to make certain guarantees based on your tags it may not. Depending on the size of the array, I would think this might be the case. Actually I _think_ it would in this case. Imagine a pure function that gives different results from the input when the input changes during execution. By making a local copy the input can't be changed from the outside and thereby returns that guarantee. Last try adding ref after const; At the off chance it's shallow copying, this should remove that.
Apr 27 2012
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Apr 28, 2012 at 12:29:24AM +0200, Era Scarecrow wrote:
 On Friday, 27 April 2012 at 16:02:00 UTC, Steven Schveighoffer
 wrote:

Hm.. you have marked all your functions pure as well.  I don't think
this will affect anything in the current implementation, but it
might.  However, I'd expect the opposite (const version is faster) if
anything.

Perhaps I'm wrong, but seeing as your working with a struct I would think the following should be noted. First, pure and nothrow. Nothing wrong with labeling functions as such, but I would recommend removing them and once the whole thing is done and you get it in a finished state that you start adding them back in if you think they qualify. Recently I know I added a safe token to a simple function which of course threw a range exception later. Oddly enough it was because I didn't check if the array was null before accessing it.

I recommend the opposite, actually. Most D code by default should be safe (don't know about nothrow though). It's good to mark most things as safe and pure, and let the compiler catch careless mistakes.
  Second, you are passing an array of structs. Either a fat pointer is
  being passed, or it's shallow copying; depending on the scenario
  either could be true. As an array it should be passing a fat pointer
  but if the compiler has to make certain guarantees based on your tags
  it may not. Depending on the size of the array, I would think this
  might be the case.

Dynamic arrays are always passed by reference (i.e. fat pointer). AFAIK the compiler does not change this just because of certain tags on the function.
 Actually I _think_ it would in this case. Imagine a pure function that
 gives different results from the input when the input changes during
 execution. By making a local copy the input can't be changed from the
 outside and thereby returns that guarantee.

No, that's wrong. The compiler checks the code at runtime to prevent impure code from slipping into the binary. It does not do anything to "patch over" impure code to make it pure. T -- Why are you blatanly misspelling "blatant"? -- Branden Robinson
Apr 27 2012
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Friday, 27 April 2012 at 22:40:46 UTC, H. S. Teoh wrote:
 I recommend the opposite, actually. Most D code by default 
 should be  safe (don't know about nothrow though). It's good to 
 mark most things as  safe and pure, and let the compiler catch 
 careless mistakes.

Your probably right..
 Dynamic arrays are always passed by reference (i.e. fat 
 pointer). AFAIK
 the compiler does not change this just because of certain tags 
 on the function.

That's why i wasn't sure... I was pretty sure it was passed via fat pointer but if adding safe and pure makes it slower, something else is going on. Perhaps just a ton more checks to make sure it doesn't during debug mode?
 No, that's wrong. The compiler checks the code at runtime to 
 prevent
 impure code from slipping into the binary. It does not do 
 anything to
 "patch over" impure code to make it pure.

I'm not going to argue, I don't know enough about the D compiler to know exactly what's going on; Plus I'd rather be wrong than right :) I guess use a profiler and check where your bottlenecks are. The results may surprise you. I've only glanced over the code so I can't offer anything more concrete.
Apr 27 2012
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 28/04/12 00:29, Era Scarecrow wrote:
 Last try adding ref after const; At the off chance it's shallow copying, this
 should remove that.

Ahhh, that works. Thank you! Back story: originally the reputation() function just took the array ratings and made an internal copy, ratings_, which was used by the rest of the code. I took that out in this commit: https://github.com/WebDrake/Dregs/commit/4d2a8a055321c2981a453fc4d82fb781da2ea5c7 ... because I found I got about a 2s speedup. It's exactly the speedup which was removed by adding "const" to the function input, so I presume it's as you say, that this was implicitly creating a local copy.
Apr 27 2012
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Saturday, April 28, 2012 01:26:38 Era Scarecrow wrote:
 On Friday, 27 April 2012 at 22:40:46 UTC, H. S. Teoh wrote:
 I recommend the opposite, actually. Most D code by default
 should be  safe (don't know about nothrow though). It's good to
 mark most things as  safe and pure, and let the compiler catch
 careless mistakes.

Your probably right..
 Dynamic arrays are always passed by reference (i.e. fat
 pointer). AFAIK
 the compiler does not change this just because of certain tags
 on the function.

That's why i wasn't sure... I was pretty sure it was passed via fat pointer but if adding safe and pure makes it slower, something else is going on. Perhaps just a ton more checks to make sure it doesn't during debug mode?

pure and safe have _zero_ affect on the types of function parameters or how they're passed. They allow the compiler to provide additional compile-time checks. The _only_ case that I'm aware of where safe affects code generation is that array bounds checking is not removed in safe code with -release like it is in system and trusted code (-noboundscheck will remove it in all code). pure can affect code optimizations in _calling_ code (e.g. making it so that a strongly pure function is only called once in an expression when it's called multiple times with the same arguments within that expression), but it doesn't affect the function's code generation at all. _Nothing_ affects how arrays are passed beyond ref, out, and in. The type of an array doesn't change due to attributes on the function that it's a parameter for. Arrays are literally structs that looks something like struct DynamicArray(T) { T* ptr; size_t length; } and their semantics for being passed to a function are identical to what you'd expect from any struct with such a definition. http://dlang.org/d-array-article.html - Jonathan M Davis
Apr 27 2012
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Saturday, 28 April 2012 at 00:04:05 UTC, Joseph Rushton 
Wakeling wrote:
 On 28/04/12 00:29, Era Scarecrow wrote:
 At the off chance it's shallow copying, this should remove 
 that.

Ahhh, that works. Thank you! ... because I found I got about a 2s speedup. It's exactly the speedup which was removed by adding "const" to the function input, so I presume it's as you say, that this was implicitly creating a local copy.

Well glad you found it :) Probably the different 'type' made it force a (useless) conversion while inside the template. Odd that the wrong answer was also the right answer :)
Apr 27 2012
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 27/04/12 20:26, Steven Schveighoffer wrote:
 No, it can't. There can easily be another non-const reference to the same data.
 Pure functions can make more assumptions, based on the types, but it would be a
 very complex determination in the type system to see if two parameters alias
the
 same data.

 Real optimization benefits come into play when immutable is there.

Question on the const/immutable distinction. Given that my function has inputs of (size_t, size_t, Rating[]), how come the size_t's can be made immutable, but the Rating[] can only be made const/const ref? Is it because the size_t's can't conceivably be changed from outside while the function is running, whereas the values in the array in principle can be?
Apr 28 2012
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Joseph Rushton Wakeling:

 ... because I found I got about a 2s speedup.  It's exactly the 
 speedup which was removed by adding "const" to the function 
 input, so I presume it's as you say, that this was implicitly 
 creating a local copy.

I suggest to take a look at the asm in both cases, and compare. Adding "const" doesn't cause local copies. Bye, bearophile
Apr 28 2012
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 28/04/12 14:53, bearophile wrote:
 I suggest to take a look at the asm in both cases, and compare. Adding "const"
 doesn't cause local copies.

I'm afraid I have no idea how to do that, or what to look for.
Apr 28 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, April 28, 2012 13:17:36 Joseph Rushton Wakeling wrote:
 On 27/04/12 20:26, Steven Schveighoffer wrote:
 No, it can't. There can easily be another non-const reference to the same
 data. Pure functions can make more assumptions, based on the types, but
 it would be a very complex determination in the type system to see if two
 parameters alias the same data.
 
 Real optimization benefits come into play when immutable is there.

Question on the const/immutable distinction. Given that my function has inputs of (size_t, size_t, Rating[]), how come the size_t's can be made immutable, but the Rating[] can only be made const/const ref? Is it because the size_t's can't conceivably be changed from outside while the function is running, whereas the values in the array in principle can be?

size_t is a value type. Arrays are reference types. So, when you pass a size_t, you get a copy, but when you pass an array, you get a slice of the array. You could make that slice const, but you can't make something immutable, because immutable stuff must _always_ be immutable. If you want an immutable array (or array of immutable elements), you must make a copy. If you call idup on an array, you'll get a copy of that array where each of its elements are immutable. And yes, because size_t is copied, it can't be changed from the outside, whereas because Rating[] is only sliced, it can be. const protects against it being altered within the function (for both of them), but doesn't protect the original array from being modified. immutable, on the other hand, is _always_ immutable and can never be mutated. But because of that, you can't convert something to immutable without making a copy (which happens automatically with value types but must be done explicitly with reference types). - Jonathan M Davis
Apr 28 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 27 Apr 2012 20:03:48 -0400, Joseph Rushton Wakeling  
<joseph.wakeling webdrake.net> wrote:

 On 28/04/12 00:29, Era Scarecrow wrote:
 Last try adding ref after const; At the off chance it's shallow  
 copying, this
 should remove that.

Ahhh, that works. Thank you! Back story: originally the reputation() function just took the array ratings and made an internal copy, ratings_, which was used by the rest of the code. I took that out in this commit: https://github.com/WebDrake/Dregs/commit/4d2a8a055321c2981a453fc4d82fb781da2ea5c7 ... because I found I got about a 2s speedup. It's exactly the speedup which was removed by adding "const" to the function input, so I presume it's as you say, that this was implicitly creating a local copy.

Try removing the ref and see if it goes back. That usage of ref should not affect anything (if anything it should be slower, since it's an extra level of indirection). There is no implicit local copy for const. I have a suspicion that you changed two things and forgot about one of them when running your tests. -Steve
Apr 30 2012
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 30/04/12 16:03, Steven Schveighoffer wrote:
 Try removing the ref and see if it goes back. That usage of ref should not
 affect anything (if anything it should be slower, since it's an extra level of
 indirection).

Removing the ref ups the time as described before, but only with GDC (DMD the runtime is the same). It's a real effect.
 There is no implicit local copy for const. I have a suspicion that you changed
 two things and forgot about one of them when running your tests.

Really don't think so. You can even check the version history of changes if you like!
May 02 2012
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Joseph Rushton Wakeling:

 Removing the ref ups the time as described before, but only 
 with GDC (DMD the runtime is the same).  It's a real effect.

 There is no implicit local copy for const. I have a suspicion 
 that you changed
 two things and forgot about one of them when running your 
 tests.

Really don't think so. You can even check the version history of changes if you like!

Then it's seems a compiler bug. Please, minimize it as much as possible, and put it in Bugzilla. Bye, bearophile
May 02 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 02 May 2012 16:05:13 -0400, Joseph Rushton Wakeling  
<joseph.wakeling webdrake.net> wrote:

 On 30/04/12 16:03, Steven Schveighoffer wrote:
 Try removing the ref and see if it goes back. That usage of ref should  
 not
 affect anything (if anything it should be slower, since it's an extra  
 level of
 indirection).

Removing the ref ups the time as described before, but only with GDC (DMD the runtime is the same). It's a real effect.

I'm not familiar with GDC, but as bearophile says, it sounds like a compiler bug. Maybe GDC is making an optimization in one case and not in the other, but both should be equivalently able to get the optimization.
 There is no implicit local copy for const. I have a suspicion that you  
 changed
 two things and forgot about one of them when running your tests.

Really don't think so. You can even check the version history of changes if you like!

IIRC, your original post concerned not-checked in code. Anyway, if you've re-run the tests, this is a confirmation. I'd go ahead and file a bug against GDC. Rest assured, const should not *slow down* anything. -Steve
May 03 2012
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 03/05/12 16:07, Steven Schveighoffer wrote:
 IIRC, your original post concerned not-checked in code. Anyway, if you've
re-run
 the tests, this is a confirmation. I'd go ahead and file a bug against GDC.
Rest
 assured, const should not *slow down* anything.

Yes, with the original post I was hoping there might be a quick answer that would avoid anyone having to go through the details of my code, e.g. a known issue or reason. But I can confirm that the effect is real and consistent; I tested it again just now, and it even scales with the system size. I've made some attempts to construct a minimal example, as opposed to my rather complicated code, but so far have been unable to reproduce the effect with a simple case. :-( What I could do is try and cut away from the existing code and see where the effect vanishes, or prepare a set of single-file versions of the code with no qualifiers, const alone, and const ref respectively. On 03/05/12 16:59, Artur Skawina wrote:
 There *was* a GDC bug some time ago, where a 'const' or 'in' function argument
 prevented inlining. Iain fixed it, so unless an old GDC version is used, it's
 probably not related.

I'm running GDC 4.6.3; I don't know if this fix made it into that release. It would make sense if faulty inlining was responsible, though.
May 04 2012
prev sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 04/05/12 18:07, Joseph Rushton Wakeling wrote:
 On 03/05/12 16:59, Artur Skawina wrote:
 There *was* a GDC bug some time ago, where a 'const' or 'in' function argument
 prevented inlining. Iain fixed it, so unless an old GDC version is used, it's
 probably not related.

I'm running GDC 4.6.3; I don't know if this fix made it into that release. It would make sense if faulty inlining was responsible, though.

Let me clarify that last statement. The core function in the struct is one that does something like this (pseudocode): auto myMainFunc(Rating[] x) { do { myFunc1(x); myFunc2(x); myFunc3(x); /* calculate exit-condition value*/ } while (/* conditions not met */) return results; } So, if I'm putting in const (or const ref) qualifiers, they're being applied to the inputs of myMainFunc, myFunc1, myFunc2 and myFunc3. These 3 are obvious candidates for compiler inlining, so anything which affects the possibility of this happening could be expected to hit performance. (... in the real code, myMainFunc() is called reputation(); the 3 interior functions are respectively userDivergence(), userReputation() and objectReputation().)
May 04 2012