www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Array bound checks removal increasing importance

reply "bearophile" <bearophileHUGS lycos.com> writes:
My opinions about D array bound checks are slowly changing. I 
still think "-boundscheck=off" is useful and good to have. But 
now I am giving increasing importance to compiler logic that 
optimizes away bound checks safely. People more and more want a 
safe system language, more and more persons don't disable array 
bound tests. This means that optimizing away bound checks is 
becoming more and more important in D. And D can't ignore this 
need any more. Even adding logic to remove 20% of the bound 
checks in numeric code is going to help D because I think more 
and more people will not disable bound checks in D.


The following notes are derived from a post by Chris in D.learn:
http://forum.dlang.org/thread/kwkccgdymkpbpyzolumy forum.dlang.org


Is it possible to optimize away all array bound checks in code 
like this?

void main() {
     int[5] data;
     foreach (const i; 0 .. 10)
         data[i] = 0;
     foreach (immutable i; 0 .. 10)
         data[i] = 0;
     int[10] big;
     foreach (const i, x; big)
         data[i] = x;
}


But the compiler must recognize this as correct code:


void main() {
     int[5] data;
     foreach (const i; 0 .. 10)
         if (i < 5)
             data[i] = 0;
}


Can this logic be added to D?


More info on the topic:
http://en.wikipedia.org/wiki/Bounds-checking_elimination
http://ssw.jku.at/Research/Papers/Wuerthinger07/Wuerthinger07.pdf

Bye,
bearophile
May 31 2014
next sibling parent reply Rikki Cattermole <alphaglosined gmail.com> writes:
On 31/05/2014 10:56 p.m., bearophile wrote:
 My opinions about D array bound checks are slowly changing. I still
 think "-boundscheck=off" is useful and good to have. But now I am giving
 increasing importance to compiler logic that optimizes away bound checks
 safely. People more and more want a safe system language, more and more
 persons don't disable array bound tests. This means that optimizing away
 bound checks is becoming more and more important in D. And D can't
 ignore this need any more. Even adding logic to remove 20% of the bound
 checks in numeric code is going to help D because I think more and more
 people will not disable bound checks in D.


 The following notes are derived from a post by Chris in D.learn:
 http://forum.dlang.org/thread/kwkccgdymkpbpyzolumy forum.dlang.org


 Is it possible to optimize away all array bound checks in code like this?

 void main() {
      int[5] data;
      foreach (const i; 0 .. 10)
          data[i] = 0;
      foreach (immutable i; 0 .. 10)
          data[i] = 0;
      int[10] big;
      foreach (const i, x; big)
          data[i] = x;
 }


 But the compiler must recognize this as correct code:


 void main() {
      int[5] data;
      foreach (const i; 0 .. 10)
          if (i < 5)
              data[i] = 0;
 }


 Can this logic be added to D?


 More info on the topic:
 http://en.wikipedia.org/wiki/Bounds-checking_elimination
 http://ssw.jku.at/Research/Papers/Wuerthinger07/Wuerthinger07.pdf

 Bye,
 bearophile

The first two foreach statements assignment statements should be compile errors. I'm actually a little bit surprised that we don't already test for this. But I spose that would actually be quite hard. Perhaps if the foreach statement value is ctfe'able we can compare it upon assignment as a value. Hmm I wonder if there is any more CTFE'able tricks we can do in the compiler.. to improve error checking.
May 31 2014
parent Rikki Cattermole <alphaglosined gmail.com> writes:
On 1/06/2014 12:04 a.m., bearophile wrote:
 Rikki Cattermole:

 The first two foreach statements assignment statements should be
 compile errors.
 I'm actually a little bit surprised that we don't already test for this.
 But I spose that would actually be quite hard.

I don't know how hard it is. One purpose of this post is to ask how much hard that is. Bye, bearophile

I know, this is just my thought on it though.
May 31 2014
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
31-May-2014 14:56, bearophile пишет:
 My opinions about D array bound checks are slowly changing. I still
 think "-boundscheck=off" is useful and good to have. But now I am giving
 increasing importance to compiler logic that optimizes away bound checks
 safely.

Cool, I hope it means you are getting ready to try your hand at implementing it!
 People more and more want a safe system language, more and more
 persons don't disable array bound tests. This means that optimizing away
 bound checks is becoming more and more important in D.

All kind of a non-argument. Who are those people that are not disabling bounds checks in system language ? :)
 And D can't
 ignore this need any more.

Surely it can be safely ignored. It's just a nice to have feature. Say in 2 years from now it may be closer to the top priority, but there are far, far more important problem to solve. Not that I'm going to discourage anybody working on it.
 Even adding logic to remove 20% of the bound
 checks in numeric code is going to help D because I think more and more
 people will not disable bound checks in D.

-- Dmitry Olshansky
May 31 2014
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
01-Jun-2014 14:21, bearophile пишет:
 Dmitry Olshansky:

 Who are those people that are not disabling bounds checks in system
 language ? :)

If you show D code on Reddit and you show to compile with -noboundscheck you hear some people growl. I don't remember this happening much in past. So I think the attitude toward disabling bound checks is changing. ----------- Once some more logic for bound checks removal is in D, it can even be added a bounded expression attribute:

An expression attribute? Why turn language into a mess over this tiny problem? It seems to me that you are considering solutions that add arbitrary amounts of complexity to solve relatively small problems.
 foreach (immutable i; 0 .. a.length)
      a[ bounded i]++;

  bounded {
      foreach (immutable i; 0 .. a.length)
          a[ bounded i]++;
 }

 The purpose of  bounded is just to give a compile-time error if the
 compiler is not able to remove one (or more if used with the {} syntax)
 array bound check.

That "just" epithet is remarkable self-destruction. -- Dmitry Olshansky
Jun 01 2014
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
02-Jun-2014 13:46, bearophile пишет:
 Dmitry Olshansky:

 An expression attribute? Why turn language into a mess over this tiny
 problem?

 It seems to me that you are considering solutions that add arbitrary
 amounts of complexity to solve relatively small problems.

 ...
 That "just" epithet is remarkable self-destruction.

Please be more gentle toward others. Your attitude is poisonous for creativity, and while I have a thick hide and I can ignore comments like this, similar answers could scare away other people from contributing to the discussions. The lack of tact and basic human kindness is a common problem in technical forum. I sometimes do the same mistake, but we should try to be better. Your good work on the regex module shows you are intelligent and qualified, so I'm sure you can also be better toward others.

Yes, I was on the edge. My apologies. The point that I agree may be lost due to the emotional amplitude was that ideas became that much more interesting by keeping "entities introduced vs problems solved" count as low as possible.
 Regarding the technical topic, array bound checks are not a tiny
 problem, and it's not a huge one. D has switches and logic to enable and
 disable them. The Oracle JavaVM ha a good amount of logic to optimize
 away array bound checks. Similar logic is visible in other language
 compilers. Ada language has refined features that allow the compiler to
 safely remove some bound checks without too much inference work. There
 are several papers on this topic, like the one I've shown in the first
 post, that show that several intelligent people have studied this topic
 extensively. Experimental high-performance numeric languages like X10
 and Chapel contain several strategies to avoid array bound checks
 safely. The experience with both famous bugs and the daily experience in
 debugging programs shows that out-of-bound array access is a frequent
 source of some of the worst bugs.

It would be interesting if you could point to a precedent of expression-level attribute used for enforcing that compiler does elide bounds checking or any other features. The fact that bounds checks can be elided as part of optimization is not new.
 There are papers that show a 20-100%
 improvement in performance coming from disabling array bound checks in
 programs that use arrays a lot, like most scientific programs. And D
 language seems fit for some heavy numerical work. D Zen considers quite
 important both performance and safety, so adding logic to the compiler
 to remove some more array bounds is a very good way to do both things at
 once.

No arguing that, the question about implementing the logic mostly boils down to who would be carrying the torch (doing the work on the compiler). Somewhat joking, giving the prologue of your proposal:
 ... But now I am giving increasing importance to compiler logic that 

I hoped you'd want to implement it. The point about extra attributes to enforce this optimization is that it would a very tough sell.
 Regarding my idea of the  bounded, perhaps it's stupid and useless, but
 you have to understand that it's just an idea. Most ideas are not even
 wrong, but sometimes the wrong ones lead to better ideas, that sometimes
 are useful. This has happened many times even in the D history (like my
 refused suggestion for the  noheap attribute. Now we have  nogc and I
 like it a lot, it allows me to understand better what my code is doing).

I suspect that most folks arguing for nogc never heard of noheap, and that's the problem with generating ideas for the sake of throwing them out there. Ideas as they stand are cheap, it's refined ideas that are priceless. Yes, a weak idea can turn out to be useful, after a lot of work was spent on refining it.
 If you criticize too much the people that invent ideas, you will have no
 future.

It goes both ways. Accepting everything is the definition of disaster.
 Bye,
 bearophile

-- Dmitry Olshansky
Jun 02 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Rikki Cattermole:

 The first two foreach statements assignment statements should 
 be compile errors.
 I'm actually a little bit surprised that we don't already test 
 for this.
 But I spose that would actually be quite hard.

I don't know how hard it is. One purpose of this post is to ask how much hard that is. Bye, bearophile
May 31 2014
prev sibling next sibling parent "Wanderer" <no-reply no-reply.org> writes:
 void main() {
     int[5] data;
     foreach (const i; 0 .. 10)
         data[i] = 0;
     foreach (immutable i; 0 .. 10)
         data[i] = 0;
     int[10] big;
     foreach (const i, x; big)
         data[i] = x;
 }

I'm not sure if bound checks should be removed here. Before removal, this code gives safe runtime exception, or, as suggested above, compile-time error. After removal, this code might cause access violation - which, unlike runtime exception, would leave program/kernel in corrupted state.
 But the compiler must recognize this as correct code:


 void main() {
     int[5] data;
     foreach (const i; 0 .. 10)
         if (i < 5)
             data[i] = 0;
 }

My personal opinion is that code like this should remain inefficient, to stimulate programmers to use simpler, easier to understand idioms, like foreach (i; data). If bound checks get removed in this case, that already covers 90% of loops under question. :-)
May 31 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Wanderer:

 void main() {
    int[5] data;
    foreach (const i; 0 .. 10)
        data[i] = 0;
    foreach (immutable i; 0 .. 10)
        data[i] = 0;
    int[10] big;
    foreach (const i, x; big)
        data[i] = x;
 }

I'm not sure if bound checks should be removed here. Before removal, this code gives safe runtime exception, or, as suggested above, compile-time error.

The compiler should catch all those cases at compile-time :-)
 But the compiler must recognize this as correct code:


 void main() {
    int[5] data;
    foreach (const i; 0 .. 10)
        if (i < 5)
            data[i] = 0;
 }

My personal opinion is that code like this should remain inefficient, to stimulate programmers to use simpler, easier to understand idioms, like foreach (i; data).

Java already removes several bound checks but this only increases the array access performance, so it's not easy to see, unless you do benchmarks (or in D compare the run-time with the run-time with disabled array bound tests). So I don't think this compiler improvement is going to worsen the D code you will see around :-) Bye, bearophile
May 31 2014
prev sibling next sibling parent reply =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
 bound tests. This means that optimizing away bound checks is 
 becoming more and more important in D. And D can't ignore this

Expressions like x[0 .. $/n] and x[$/n .. $] are important in many divide and conquer (recursive) algorithms such as quick-sort and shall not need bounds check simply because 0 <= $/n <= $, when n >=1 I've looked around in DMD for a suitable place to add checks for this but haven't found the spot where range-check is injected. Help anyone?
May 31 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/31/2014 4:06 PM, "Nordlöw" wrote:
 I've looked around in DMD for a suitable place to add checks for this but
 haven't found the spot where range-check is injected. Help anyone?

There isn't a suitable place. To make it work, data flow analysis would have to be added to the front end. While doable, this is not a simple addition. Eventually, we'll have to do it as a lot of things become possible & better with data flow analysis, not just bounds check elimination.
May 31 2014
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/31/2014 5:02 PM, "Nordlöw" wrote:
 Could you elaborate a bit on what data flow optimizations mean?

http://en.wikipedia.org/wiki/Data-flow_analysis
 What other kinds of optimizations will become possible?

Escape analysis, for one. http://en.wikipedia.org/wiki/Escape_analysis
 Is this something that other langs/compilers offer?

All modern compilers use DFA in the optimizer pass (including dmd), but that is not set up to provide information back to the front end.
May 31 2014
prev sibling next sibling parent "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Meta"  wrote in message news:pogogtdjyetukennyras forum.dlang.org...

 I've always wondered if VRP can be leveraged in certain situations. I 
 can't remember exactly how it's supposed to work, but very basically, 
 isn't it just numeric variables (and expressions?) having an associated 
 range that they carry around with them at compile time, so something like 
 this is possible:

 long n1 = long.max;
 int n2 = n1 % 3; //No cast needed due to VRP

 Couldn't this be used for other things as well, such as detecting numeric 
 overflow at compile time, or like Nordlow suggested, figuring out when 
 it's safe to elide an array bounds check?

It can, and it already is. The problem is that n1 above is not guaranteed to _stay_ equal to long.max. Without data flow analysis the compiler doesn't know that it is never re-assigned, so the possible range is any value that fits in a long. There are cases where it should be able to tell without data flow analysis but are currently not implemented.
May 31 2014
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/31/14, 6:47 PM, Meta wrote:
 On Saturday, 31 May 2014 at 23:30:41 UTC, Walter Bright wrote:
 There isn't a suitable place. To make it work, data flow analysis
 would have to be added to the front end. While doable, this is not a
 simple addition. Eventually, we'll have to do it as a lot of things
 become possible & better with data flow analysis, not just bounds
 check elimination.

I've always wondered if VRP can be leveraged in certain situations. I can't remember exactly how it's supposed to work, but very basically, isn't it just numeric variables (and expressions?) having an associated range that they carry around with them at compile time, so something like this is possible: long n1 = long.max; int n2 = n1 % 3; //No cast needed due to VRP Couldn't this be used for other things as well, such as detecting numeric overflow at compile time, or like Nordlow suggested, figuring out when it's safe to elide an array bounds check?

For $/n VRP can't be used because it's geared toward constants, not relative to variables. So VRP could inform of something like "this number is between 0 and 5", not "this number is between 0 and a.length". -- Andrei
Jun 01 2014
prev sibling next sibling parent =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
 as a lot of things become possible & better with data flow 
 analysis, not just bounds check elimination.

Great. Now I know. Btw: Could you elaborate a bit on what data flow optimizations mean? What other kinds of optimizations will become possible? Is this something that other langs/compilers offer?
May 31 2014
prev sibling next sibling parent "Meta" <jared771 gmail.com> writes:
On Saturday, 31 May 2014 at 23:30:41 UTC, Walter Bright wrote:
 There isn't a suitable place. To make it work, data flow 
 analysis would have to be added to the front end. While doable, 
 this is not a simple addition. Eventually, we'll have to do it 
 as a lot of things become possible & better with data flow 
 analysis, not just bounds check elimination.

I've always wondered if VRP can be leveraged in certain situations. I can't remember exactly how it's supposed to work, but very basically, isn't it just numeric variables (and expressions?) having an associated range that they carry around with them at compile time, so something like this is possible: long n1 = long.max; int n2 = n1 % 3; //No cast needed due to VRP Couldn't this be used for other things as well, such as detecting numeric overflow at compile time, or like Nordlow suggested, figuring out when it's safe to elide an array bounds check?
May 31 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Daniel Murphy:

 There are cases where it should be able to tell without data 
 flow analysis but are currently not implemented.

Such cases are not rare: https://issues.dlang.org/show_bug.cgi?id=10594 https://issues.dlang.org/show_bug.cgi?id=10615 https://issues.dlang.org/show_bug.cgi?id=10749 https://issues.dlang.org/show_bug.cgi?id=10751 (There is one more of similar suggestions). In general in your D code use immutable/const everywhere you can. Bye, bearophile
May 31 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Dmitry Olshansky:

 Who are those people that are not disabling bounds checks in 
 system language ? :)

If you show D code on Reddit and you show to compile with -noboundscheck you hear some people growl. I don't remember this happening much in past. So I think the attitude toward disabling bound checks is changing. ----------- Once some more logic for bound checks removal is in D, it can even be added a bounded expression attribute: foreach (immutable i; 0 .. a.length) a[ bounded i]++; bounded { foreach (immutable i; 0 .. a.length) a[ bounded i]++; } The purpose of bounded is just to give a compile-time error if the compiler is not able to remove one (or more if used with the {} syntax) array bound check. Bye, bearophile
Jun 01 2014
prev sibling next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 31 May 2014 at 10:56:06 UTC, bearophile wrote:
 void main() {
     int[5] data;
     foreach (const i; 0 .. 10)
         data[i] = 0;
     foreach (immutable i; 0 .. 10)
         data[i] = 0;
     int[10] big;
     foreach (const i, x; big)
         data[i] = x;
 }


 But the compiler must recognize this as correct code:


 void main() {
     int[5] data;
     foreach (const i; 0 .. 10)
         if (i < 5)
             data[i] = 0;
 }


 Can this logic be added to D?


 More info on the topic:
 http://en.wikipedia.org/wiki/Bounds-checking_elimination
 http://ssw.jku.at/Research/Papers/Wuerthinger07/Wuerthinger07.pdf

 Bye,
 bearophile

What do GDC or LDC generate for these sample code with optimizations on ?
Jun 01 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/2/2014 4:40 AM, bearophile wrote:
 I don't know what to report, it just crashes, with no error messages.

Report the source code you fed to it that caused the crash.
Jun 02 2014
prev sibling next sibling parent =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
 For $/n VRP can't be used because it's geared toward constants, 
 not relative to variables. So VRP could inform of something 
 like "this number is between 0 and 5", not "this number is 
 between 0 and a.length". -- Andrei

That's what I thought of aswell. Are there any plans to formalize the structures and algorithms of such expressions and their propagations? I guess LLVM or GCC must have thought of these things, right?
Jun 01 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
deadalnix:

 What do GDC or LDC generate for these sample code with 
 optimizations on ?

This is not an interesting question because those two programs are meant as parts of larger programs. ldc2 optimizes away both programs to "xorl %eax, %eax". And I can't test on GDC because GDC compiler crashes on my system since years. Bye, bearophile
Jun 02 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Dmitry Olshansky:

 An expression attribute? Why turn language into a mess over 
 this tiny problem?

 It seems to me that you are considering solutions that add 
 arbitrary amounts of complexity to solve relatively small 
 problems.

 ...
 That "just" epithet is remarkable self-destruction.

Please be more gentle toward others. Your attitude is poisonous for creativity, and while I have a thick hide and I can ignore comments like this, similar answers could scare away other people from contributing to the discussions. The lack of tact and basic human kindness is a common problem in technical forum. I sometimes do the same mistake, but we should try to be better. Your good work on the regex module shows you are intelligent and qualified, so I'm sure you can also be better toward others. Regarding the technical topic, array bound checks are not a tiny problem, and it's not a huge one. D has switches and logic to enable and disable them. The Oracle JavaVM ha a good amount of logic to optimize away array bound checks. Similar logic is visible in other language compilers. Ada language has refined features that allow the compiler to safely remove some bound checks without too much inference work. There are several papers on this topic, like the one I've shown in the first post, that show that several intelligent people have studied this topic extensively. Experimental high-performance numeric languages like X10 and Chapel contain several strategies to avoid array bound checks safely. The experience with both famous bugs and the daily experience in debugging programs shows that out-of-bound array access is a frequent source of some of the worst bugs. There are papers that show a 20-100% improvement in performance coming from disabling array bound checks in programs that use arrays a lot, like most scientific programs. And D language seems fit for some heavy numerical work. D Zen considers quite important both performance and safety, so adding logic to the compiler to remove some more array bounds is a very good way to do both things at once. Regarding my idea of the bounded, perhaps it's stupid and useless, but you have to understand that it's just an idea. Most ideas are not even wrong, but sometimes the wrong ones lead to better ideas, that sometimes are useful. This has happened many times even in the D history (like my refused suggestion for the noheap attribute. Now we have nogc and I like it a lot, it allows me to understand better what my code is doing). If you criticize too much the people that invent ideas, you will have no future. Bye, bearophile
Jun 02 2014
prev sibling next sibling parent Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 2 June 2014 10:24, bearophile via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 deadalnix:


 What do GDC or LDC generate for these sample code with optimizations on ?

This is not an interesting question because those two programs are meant as parts of larger programs. ldc2 optimizes away both programs to "xorl %eax, %eax". And I can't test on GDC because GDC compiler crashes on my system since years.

Then 1) Get a newer version of GDC 2) Raise bugs - you do this for DMD. Why not GDC?
Jun 02 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Iain Buclaw:

 1) Get a newer version of GDC
 2) Raise bugs - you do this for DMD.  Why not GDC?

I don't know what to report, it just crashes, with no error messages. Bye, bearophile
Jun 02 2014
prev sibling next sibling parent Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 2 June 2014 12:40, bearophile via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 Iain Buclaw:


 1) Get a newer version of GDC
 2) Raise bugs - you do this for DMD.  Why not GDC?

I don't know what to report, it just crashes, with no error messages. Bye, bearophile

That doesn't sound right. Where does it crash? - Compile-time? Crashes *always* have backtraces. - Runtime? Reduce the program down by hand or using dustmite and send bug to the effect of: Runtime SEGV doing XXX
Jun 02 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 Report the source code you fed to it that caused the crash.

Even hello world crashes. Bye, bearophile
Jun 02 2014
prev sibling next sibling parent Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 2 June 2014 17:33, bearophile via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 Walter Bright:


 Report the source code you fed to it that caused the crash.

Even hello world crashes. Bye, bearophile

Then that is a start. Post a bug and report clearly what your environment is.
Jun 02 2014
prev sibling next sibling parent =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
 Please be more gentle toward others. Your attitude is poisonous 
 for creativity, and while I have a thick hide and I can ignore

Yes, please! Be good to your fellow D coders!
Jun 02 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Monday, 2 June 2014 at 09:24:38 UTC, bearophile wrote:
 deadalnix:

 What do GDC or LDC generate for these sample code with 
 optimizations on ?

This is not an interesting question because those two programs are meant as parts of larger programs. ldc2 optimizes away both programs to "xorl %eax, %eax". And I can't test on GDC because GDC compiler crashes on my system since years. Bye, bearophile

I think we should focus on solving problems that modern backend aren't capable to optimize. If they are able, then we should focus on other thing, or identify the cases where they are unable to do so, figure out why, and find a solution (maybe improving existing optimizers are the road, maybe improving the frontend to feed more infos to the optimizer is the way to go, maybe something else...).
Jun 02 2014
prev sibling next sibling parent Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:
--001a113a6372d0028d04fae816ef
Content-Type: text/plain; charset=UTF-8

On 2 Jun 2014 17:49, "Iain Buclaw" <ibuclaw gdcproject.org> wrote:
 On 2 June 2014 17:33, bearophile via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 Walter Bright:


 Report the source code you fed to it that caused the crash.

Even hello world crashes. Bye, bearophile

Then that is a start. Post a bug and report clearly what your

Also where you got gdc from if you downloaded binaries instead of built from development. --001a113a6372d0028d04fae816ef Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable <p dir=3D"ltr"><br> On 2 Jun 2014 17:49, &quot;Iain Buclaw&quot; &lt;<a href=3D"mailto:ibuclaw = gdcproject.org">ibuclaw gdcproject.org</a>&gt; wrote:<br> &gt;<br> &gt; On 2 June 2014 17:33, bearophile via Digitalmars-d<br> &gt; &lt;<a href=3D"mailto:digitalmars-d puremagic.com">digitalmars-d purem= agic.com</a>&gt; wrote:<br> &gt; &gt; Walter Bright:<br> &gt; &gt;<br> &gt; &gt;<br> &gt; &gt;&gt; Report the source code you fed to it that caused the crash.<b= r> &gt; &gt;<br> &gt; &gt;<br> &gt; &gt; Even hello world crashes.<br> &gt; &gt;<br> &gt; &gt; Bye,<br> &gt; &gt; bearophile<br> &gt;<br> &gt;<br> &gt; Then that is a start. =C2=A0Post a bug and report clearly what your en= vironment is.</p> <p dir=3D"ltr">Also where you got gdc from if you downloaded binaries inste= ad of built from development.</p> --001a113a6372d0028d04fae816ef--
Jun 02 2014
prev sibling next sibling parent "Kagamin" <spam here.lot> writes:
On Saturday, 31 May 2014 at 10:56:06 UTC, bearophile wrote:
 Even adding logic to remove 20% of the bound checks in numeric 
 code is going to help D because I think more and more people 
 will not disable bound checks in D.

What speedup those 20% will give? 3%? Shouldn't optimization go a different route? 1. Get annoying performance problem. 2. Diagnose it. 3. Optimize the hot spot. Do you have 1?
Jun 02 2014
prev sibling next sibling parent "Kagamin" <spam here.lot> writes:
On Monday, 2 June 2014 at 09:46:03 UTC, bearophile wrote:
 There are papers that show a 20-100% improvement in performance 
 coming from disabling array bound checks in programs that use 
 arrays a lot, like most scientific programs. And D language 
 seems fit for some heavy numerical work.

Scientific programs usually process trusted data (or easily validated), so they may need correctness checks, but don't need security checks. If you see the algorithm works with bound checks, you can turn them off.
Jun 03 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Kagamin:

 Scientific programs usually process trusted data (or easily 
 validated), so they may need correctness checks, but don't need 
 security checks.

I agree.
 If you see the algorithm works with bound checks, you can turn 
 them off.

Algorithms go in different code paths, so different runs hit the arrays differently. In scientific code you have to trust the correctness of the results. So you prefer to leave array bound checks active (as in Java, Julia, Python). If your compiler is able to remove some bound checks and mechanically verify the code as safe, that's even better (as in Java, and probably in future Julia). If you give me a compiler able to remove most array bound checks safely, you will see me never disable them blindly again :-) Bye, bearophile
Jun 03 2014
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/31/2014 3:56 AM, bearophile wrote:
 Even adding logic to remove 20% of the bound checks in numeric code is going
to help D

https://github.com/D-Programming-Language/dmd/pull/3620
Jun 05 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 https://github.com/D-Programming-Language/dmd/pull/3620

Yes, it's a start point :-) Bye, bearophile
Jun 05 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Sorry for my slow answering, I'm trying to catch up.

Kagamin:

 Shouldn't optimization go a different route?
 
 1. Get annoying performance problem.
 2. Diagnose it.
 3. Optimize the hot spot.
 
 Do you have 1?

That's a good strategy if you are optimizing user code. But even when you write library code you sometimes can't use that strategy, because you are not always sure the performance needs of the people that will use the library. That's why Phobos should be written to be efficient regardless of evidence of performance problems. The same is true for compiler writers. If you compile D code that uses arrays a lot with and without "-noboundscheck" you see some run time difference. It's nice to think that the D compiler will remove part of such difference in all your future D programs that you have not yet written. Array bound checks removal was found to be a sufficiently important problem to solve even in Java, that is used for heavy array processing less than other languages like Fortran. Bye, bearophile
Jun 05 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
deadalnix:

I think we should focus on solving problems that modern backend 
aren't capable to optimize.<

I agree. But those D snippets I have written are not able to show what the backends are or aren't able to do. Generally if you compile D code even with LDC2 you see a significant performance difference in heavy-array processing code if you compile it with or without -noboundscheck. I have seen this plenty of times. If you want we can take a look at some benchnmarks. Bye, bearophile
Jun 05 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Dmitry Olshansky:

 It would be interesting if you could point to a precedent of 
 expression-level attribute used for enforcing that compiler 
 does elide bounds checking

Perhaps that's a little invention of mine :-) In the last years I've seen that while optimizations are important, there are situations where you need to know if an optimization is done. Array bound checks removal is not able to change the code semantics like tail call optimization (TCO), but like forced inlining you sometimes want to be sure a small amount of lines of a numeric processing kernel doesn't waste run time verifying bounds. (And if you are sure certain bound checks are not present, you have also verified that a part of the code doesn't raise array bound run-time errors. So it's also a code verification technique, that I think will become more common in the next years). If you write a contract between the compiler and the programmer, and it fails (so the compiler is not able to remove all bound checks in a piece of D code inside the bounded { ... }), then the programmer can add strongly typed indexes to help the compiler figure out at compile time the correctness of array accesses (strongly typed array indexes that I have discussed in a recent thread are indeed also useful for the compiler optimizations, they are not just to help avoid programmers bugs), or the programmer can add some asserts or change the code in other small ways to reach the same goal. Once such goal is reached, and your kernel computation is efficient, you don't care if in some cases in the rest of the code the D compiler is not able to remove all array bound checks. So only a small/certain percentage of the code is meant to go inside the braces of bounded{...}. The alternative solution is to put the kernel into another module, and compile it separately with "-boundscheck=off". But this is less handy and it's less safe. Generally I like ways to express a richer semantics in the code. Bye, bearophile
Jun 05 2014
prev sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Thursday, 5 June 2014 at 09:36:53 UTC, bearophile wrote:
 deadalnix:

I think we should focus on solving problems that modern backend 
aren't capable to optimize.<

I agree. But those D snippets I have written are not able to show what the backends are or aren't able to do. Generally if you compile D code even with LDC2 you see a significant performance difference in heavy-array processing code if you compile it with or without -noboundscheck. I have seen this plenty of times. If you want we can take a look at some benchnmarks.

I know and this is worthwhile to add some effort to optimize that. I was simply reminding that we should try to understand in which case the code is not optimized and why, before jumping to solutions.
Jun 05 2014