
digitalmars.D - Software Assurance Reference Dataset

reply Walter Bright <newshound2 digitalmars.com> writes:
http://samate.nist.gov/SRD/view.php?count=20&first=0&sort=asc

This is a list of security vulnerabilities in languages including C/C++. 88,737 
of them (!).

It's an interesting list, and an opportunity for D. I once said that my job was 
to put Coverity out of business. The more of these issues D can automatically 
prevent with @safe, the better.
Jun 25 2014
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 It's an interesting list, and an opportunity for D. I once said 
 that my job was to put Coverity out of business.

Even if D has wide success, I don't think D will delete all the C and C++ code out of existence, so I don't think D will ever put Coverity out of business :-)
 The more of these issues D can automatically prevent
 with @safe, the better.

I think D will need better guns for that, like a more principled (formalized, written fully down in the specs, eventually even proved) management of uniqueness, etc. Bye, bearophile
Jun 26 2014
prev sibling next sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 26 June 2014 at 00:36:58 UTC, Walter Bright wrote:
 http://samate.nist.gov/SRD/view.php?count=20&first=0&sort=asc

 This is a list of security vulnerabilities in languages 
 including C/C++. 88,737 of them (!).

 It's an interesting list, and an opportunity for D. I once said 
 that my job was to put Coverity out of business. The more of 
 these issues D can automatically prevent with @safe, the better.

As expected, most of them would be automatically fixed by using a descendant of the Algol/Mesa/Cedar branch of languages. I look forward to @safe providing a similar set of guarantees. -- Paulo
Jun 26 2014
prev sibling next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 It's an interesting list, and an opportunity for D. I once said 
 that my job was to put Coverity out of business. The more of 
 these issues D can automatically prevent with @safe, the better.

One kind of problem left is to avoid stack overflows. I have had several such cases in my D code (perhaps because I use fixed-sized arrays a lot). I think they can be caused by:

1) Too many large allocations in stack frames;
2) By alloca();
3) By recursion and co-recursion chains.

The first cause is probably sufficiently easy to solve mechanically, with a conservative call tree analysis of the code. This is a job for external tools. The second cause can be solved with a formal proof of the upper bounds of the arguments of alloca, or even more conservatively by disallowing alloca. The third cause, a cycle in the call graph, can be found with an external tool, but in theory it's also easy to avoid with a @no_call_cycles annotation :-) Such an annotation looks useful only if you want to avoid stack overflows, so it's better to use a more general annotation that also forbids alloca(). Something like @fixed_stack or @constant_memory.

I think adding this annotation to D is a little overkill (despite SPARK and Ada analysers having done stack size analysis for many years). So I think a better solution (besides creating an external tool) is to add to D a more general feature to attach some compile-time semantics to user-defined annotations. To do this you need a trait that returns a sequence of all the functions called by a function.

Bye, bearophile
Jun 26 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/26/2014 2:19 AM, bearophile wrote:
 One kind of problem left is to avoid stack overflows.

In general, stack overflow checking at compile time is the halting problem. It needs a runtime check.

Stack overflows are not safety problems when a guard page is used past the end of the stack; then, overflow checking is done in hardware. Guard pages aren't currently used for fibers, so overflows are a real danger there. In 64 bit code, however, one should be able to use guard pages for fibers.
Jun 26 2014
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 06/26/2014 11:35 AM, Walter Bright wrote:
 On 6/26/2014 2:19 AM, bearophile wrote:
 One kind of problem left is to avoid stack overflows.

In general, stack overflow checking at compile time is the halting problem.

That is irrelevant to his point because he is not suggesting to solve the general problem precisely. Analogously: In general, checking whether some variable in e.g. Python is ever assigned a string value is undecidable as well, but this does not imply we cannot have 'int' variables in D.
Jun 26 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/26/2014 2:52 AM, Timon Gehr wrote:
 On 06/26/2014 11:35 AM, Walter Bright wrote:
 On 6/26/2014 2:19 AM, bearophile wrote:
 One kind of problem left is to avoid stack overflows.

In general, stack overflow checking at compile time is the halting problem.

That is irrelevant to his point because he is not suggesting to solve the general problem precisely. Analogously: In general, checking whether some variable in e.g. Python is ever assigned a string value is undecidable as well, but this does not imply we cannot have 'int' variables in D.

When you're dealing with security issues, which is what this is about, you'll need a guarantee about stack overflow. Adding annotations is not helpful with this because they are not checkable. Again, what WORKS is a runtime check.
Jun 26 2014
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 06/26/2014 10:29 PM, Walter Bright wrote:
 On 6/26/2014 2:52 AM, Timon Gehr wrote:
 On 06/26/2014 11:35 AM, Walter Bright wrote:
 On 6/26/2014 2:19 AM, bearophile wrote:
 One kind of problem left is to avoid stack overflows.

In general, stack overflow checking at compile time is the halting problem.

That is irrelevant to his point because he is not suggesting to solve the general problem precisely. Analogously: In general, checking whether some variable in e.g. Python is ever assigned a string value is undecidable as well, but this does not imply we cannot have 'int' variables in D.

When you're dealing with security issues, which is what this is about,

This is about _avoiding stack overflows_. It's written down literally in the quoted passage.
 you'll need a guarantee about stack overflow. Adding annotations is not
 helpful with this because they are not checkable.
 ...

Not true. Basic facts:

- In practice, programs are constructed explicitly to fulfill a particular purpose and, in particular, if they never overflow the stack, they usually do so for a reason.
- Reasoning can be formalized, and formal proofs can be written down in such a way that a machine can check whether the proof is valid.
- Annotations can include a formal proof.

We've had similar discussions before. Why do we _still_ have to argue about the _possibility_ of having a system that is helpful for proving stack overflows (or other bad behaviours) absent? You can say "out of scope" or "not a priority" or "this should be realized in third-party tool support", but not "it is impossible".
 Again, what WORKS is a runtime check.

A runtime check will not avoid the problem, which is exhaustion of stack space.
Jun 26 2014
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/26/14, 4:16 PM, Timon Gehr wrote:
 - Annotations can include a formal proof.

If a function can be annotated with what other functions it calls (non-transitively), I agree that it can be guaranteed with local semantic checking that a program won't overflow. However, requiring such annotations would be onerous, and deducing them would require whole program analysis.
 We've had similar discussions before. Why do we _still_ have to argue
 about the _possibility_ of having a system that is helpful for
 proving stack overflows (or other bad behaviours) absent?

 You can say "out of scope" or "not a priority" or "this should be
 realized in third-party tool support" but not "it is impossible".

I also seem to reckon Walter is in the other extreme (he asserts it can't be done at all, period). My understanding is that it can be done but only with annotations or whole program analysis. Andrei
Jun 26 2014
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 06/27/2014 01:47 AM, Andrei Alexandrescu wrote:
 On 6/26/14, 4:16 PM, Timon Gehr wrote:
 ...

 You can say "out of scope" or "not a priority" or "this should be
 realized in third-party tool support" but not "it is impossible".

I also seem to reckon Walter is in the other extreme (he asserts it can't be done at all, period).

I don't think it makes sense to imply that I am defending an extreme position. I wasn't discussing design choices; truth is not a continuum, and I was just objecting to the line of reasoning that went like "undecidability of the halting problem implies that formal reasoning is pointless", which is clearly untrue.
 My understanding is that it can be done
 but only with annotations or whole program analysis.
 ...

Any way it is done, it doesn't come for free.
Jun 26 2014
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/26/14, 5:29 PM, Timon Gehr wrote:
 On 06/27/2014 01:47 AM, Andrei Alexandrescu wrote:
 On 6/26/14, 4:16 PM, Timon Gehr wrote:
 ...

 You can say "out of scope" or "not a priority" or "this should be
 realized in third-party tool support" but not "it is impossible".

I also seem to reckon Walter is in the other extreme (he asserts it can't be done at all, period).

I don't think it makes sense to imply that I am defending an extreme position. I wasn't discussing design choices; truth is not a continuum, and I was just objecting to the line of reasoning that went like "undecidability of the halting problem implies that formal reasoning is pointless", which is clearly untrue.

I agree.
 My understanding is that it can be done
 but only with annotations or whole program analysis.
 ...

Any way it is done, it doesn't come for free.

That's not a fair characterization. Interprocedural analysis is quite a different ball game from classic semantic checking. Andrei
Jun 26 2014
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/26/2014 4:16 PM, Timon Gehr wrote:
 On 06/26/2014 10:29 PM, Walter Bright wrote:
 When you're dealing with security issues, which is what this about,


 This is about _avoiding stack overflows_. It's written down literally in the quoted passage.

Check the title of this thread, the linked issues, and bearophile's comment bringing up stack overflows. It's about security, not niceness.
 you'll need a guarantee about stack overflow. Adding annotations is not
 helpful with this because they are not checkable.
 ...

Not true. Basic facts: - In practice, programs are constructed explicitly to fulfill a particular purpose and, in particular, if they do never overflow the stack, they usually do so for a reason. - Reasoning can be formalized and formal proofs can be written down in such a way that a machine can check whether the proof is valid. - Annotations can include a formal proof. We've had similar discussions before. Why do we _still_ have to argue about the _possibility_ of having a system that is helpful for proving stack overflows (or other bad behaviours) absent? You can say "out of scope" or "not a priority" or "this should be realized in third-party tool support" but not "it is impossible".

In the general case, it is impossible. And I suspect the subset of D programs that don't have indirection or recursion is so small as to not be worth the bother. I do know there is a class of applications, for example in critical embedded systems, where recursion and indirection, and even allocation, are not allowed. Using D in such cases would require eschewing Phobos, and some other care taken, but that isn't the issue here, which is security vulnerabilities.
 Again, what WORKS is a runtime check.


We disagree on the problem. The problem I initiated this thread on is "security vulnerabilities". Terminating a program that overflows its stack is not a security vulnerability.

As for formal proofs, I'll let slip a guilty secret - I know so little about computer science proofs I wouldn't even recognize one if I saw one, let alone have the ability to craft one. So when you advocate formal proofs, directing it at me won't accomplish anything. To get formal proofs for D in the spec, in the code, in the compiler, I cannot deliver that. People like you are going to have to step forward and do them.
Jun 26 2014
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/26/2014 2:50 AM, bearophile wrote:
 Walter Bright:

 In general, stack overflow checking at compile time is the halting problem. It
 needs a runtime check.

There are several systems, including SPARK, that perform a conservative and apparently acceptable stack overflow check at compile time. If you don't agree with what I've written in my post, then please give a more detailed answer to the points I've written above.

Spark is a research language that does not work, as I've discovered and discussed with you before. The max stack usage cannot be determined at compile time - again, this is the halting problem.
 Stack overflows are not safety problems when a guard page is used past the end
 of the stack.

to manage such failures in a good way.

Please explain.
 In most other languages it's a "safety"
 problem, if your program execution has some importance.

I mean "safety" in the sense of being a security problem, which is the context of this thread.
Jun 26 2014
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/26/2014 2:01 PM, Araq wrote:
 Spark is a research language that does not work, as I've discovered and
 discussed with you before. It cannot be determined the max stack usage at
 compile time, again, this is the halting problem.

What?! It's easily solvable: Forbid recursion and indirect function calls

Oh well, there goes about 90% of D programs and pretty much all use of the D runtime library!
 and it's guaranteed that the program only requires
 a fixed size stack and you can compute an upper bound of the
 required stack size at compile-time. Which is BTW exactly what
 OpenCL does as GPUs tend to have no stacks.

 In what way is Spark a "research language that does not work"?

A while back, bearophile posted here some advocacy that Spark was using its contracts to prove things about the code. I experimented with it a bit and discovered that such proofs did not go beyond the trivial. As I recall, bearophile then agreed that it was a great idea that the implementation fell far short of.
 And how many language design issues need to be discovered until
 you admit that Safe-D is a "research language that does not work"?

I recommend that all such issues you discover be put on bugzilla, and marked with the 'safe' keyword, so they can be addressed. I think there's a big difference between "only works for trivial cases" with no idea how to handle the rest and "does not work for all cases" and there being reasonable paths to handle them. I admit that Safe D does not yet handle all the cases.
Jun 26 2014
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/26/14, 2:01 PM, Araq wrote:
 Spark is a research language that does not work, as I've discovered
 and discussed with you before. It cannot be determined the max stack
 usage at compile time, again, this is the halting problem.

What?! It's easily solvable: Forbid recursion and indirect function calls

I do agree that a useful subset of a language can be conservatively defined that doesn't require solving the halting problem. But that's not easy at all - it requires interprocedural analysis. Andrei
Jun 26 2014
prev sibling next sibling parent Paulo Pinto <pjmlp progtools.org> writes:
On 29.06.2014 11:07, ponce wrote:
 On Thursday, 26 June 2014 at 21:01:40 UTC, Araq wrote:
 Spark is a research language that does not work, as I've discovered
 and discussed with you before. It cannot be determined the max stack
 usage at compile time, again, this is the halting problem.

What?! It's easily solvable: Forbid recursion and indirect function calls and it's guaranteed that the program only requires a fixed size stack and you can compute an upper bound of the required stack size at compile-time. Which is BTW exactly what OpenCL does as GPUs tend to have no stacks.

Actually CUDA does have recursion, and indirect function calls.

Not only that, CUDA offers a quite usable C++ subset whereas OpenCL keeps us in the primitive C land. -- Paulo
Jun 29 2014
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/10/2014 5:08 AM, bearophile wrote:
 (Sorry for the very late answer.)

 Walter Bright:

 Stack overflows are not safety problems when a guard page is used past the end
 of the stack.

designed to manage such failures in a good way.

Please explain.

The idea comes from the Erlang language (and perhaps Erlang has copied it from something else), and then Rust copied it (and indeed, if you look at the "Influenced by" list here, you see Erlang, and Rust has copied only the Erlang feature I am discussing here: http://en.wikipedia.org/wiki/Rust_language ).

Erlang systems must be extremely reliable, they run telecommunication systems that must just always work, with only seconds or minutes of downtime every year. But programs contain errors and bugs, so to face this problem Erlang (and now Rust) has chosen a curious strategy. For the description, see "4.3 Error handling philosophy" at pages 104-109 of this PDF file: http://www.erlang.org/download/armstrong_thesis_2003.pdf

It's also a bit explained here, at the "3. What is fault-tolerance" section: http://stackoverflow.com/questions/3172542/are-erlang-otp-messages-reliable-can-messages-be-duplicated/3176864#3176864

Some more technical info: http://www.erlang.org/doc/design_principles/sup_princ.html

Bye, bearophile

Thanks for the links!
Jul 10 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/11/2014 10:28 AM, deadalnix wrote:
 The compiler can ensure that you hit at least every 4k or so.

And it already does.
Jul 12 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/13/2014 4:04 AM, "Marc Schütz" <schuetzm gmx.net>" wrote:
 On Sunday, 13 July 2014 at 03:25:08 UTC, Walter Bright wrote:
 On 7/11/2014 10:28 AM, deadalnix wrote:
 The compiler can ensure that you hit at least every 4k or so.

And it already does.

Doesn't look so:

      int bar(int a) {
          int[8000] b = void;
          b[$-1] = a;
          return b[$-1];
      }

On Win32:

_D4foo53barFiZi comdat
        assume  CS:_D4foo53barFiZi
        push    EBP
        mov     EBP,ESP
        mov     EDX,7
L8:     sub     ESP,01000h
        test    [ESP],ESP
        dec     EDX
        jne     L8
        sub     ESP,0D04h
        lea     ECX,-8[EBP]
        mov     [ECX],EAX
        mov     EAX,-8[EBP]
        leave
        ret

It doesn't do it on Linux because gcc doesn't do it. But the capability is in the back end, and it does it for alloca(), too.
Jul 13 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/14/2014 3:07 AM, "Marc Schütz" <schuetzm gmx.net>" wrote:
 Hmm... but this is using DMD, not GDC. Or do you mean that DMD doesn't do it,
 because GCC doesn't do it either?

Right.
 If so, what is the reason for this? Why
 shouldn't this feature be enabled on every platform?

There are often undocumented reasons why gcc generates code in certain ways, and I figured that being compatible was the best strategy. Every time I deviate, it comes back eventually as a bug report.
Jul 14 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/19/2014 11:06 PM, bearophile wrote:
 A @tailrec annotation seems good, and will improve the functional usages of D.
 It should stop compilation with an error if the function doesn't call itself or
 if the compiler is not able to remove the recursive call. This means it has to
 be fully enforced.

If you want to guarantee replacement of a recursive call with a loop, just write a loop.
Jul 19 2014
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/19/2014 11:55 PM, bearophile wrote:
 Walter Bright:

 If you want to guarantee replacement of a recursive call with a loop, just
 write a loop.

 There are cases where a recursive algorithm is nicer. And people that want to use D functionally may also be used to writing code recursively.

I doubt they'll want to use a @tailrec attribute.
 What about the  continuation
 (http://en.wikipedia.org/wiki/Continuation-passing_style )?

I doubt they'll want to use that attribute, either. In any case, D supports more styles of programming than any other language I can think of. I doubt adding even more will be that helpful.
Jul 20 2014
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 20-Jul-2014 10:45, Walter Bright wrote:
 On 7/19/2014 11:06 PM, bearophile wrote:
 A  tailrec annotation seems good, and will improve the functional
 usages of D.
 It should stop compilation with a error if the function doesn't call
 itself or
 if the compiler is not able to remove the recursive call. This means
 it has to
 be fully enforced.

If you want to guarantee replacement of a recursive call with a loop, just write a loop.

Functional programming is full of simple recursion, and it would be nice not to stack overflow in debug builds.

Another use case is the so-called threaded-code interpreter, which can be done with either computed gotos (and a bunch of labels) or forced tail calls (and a bunch of functions). In both cases the computed jump or tail call is indirect.

-- Dmitry Olshansky
Jul 20 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/20/2014 3:27 AM, Dmitry Olshansky wrote:
 Functional programming is full of simple recursion and it would be nice not to
 stack overflow in debug builds.

Traditional FP languages don't have loops, and so must do recursion. D has loops, even in pure functions; there's no reason not to use them.
 Another use case is so-called threaded code interpreter, that can be done with
 either computed gotos (and bunch of labels) or forced tail calls (and bunch of
 functions). In both cases computed jump or tail call is indirect.

http://eli.thegreenplace.net/2012/07/12/computed-goto-for-efficient-dispatch-tables/

The computed goto is faster for two reasons, according to the article:

1. The switch does a bit more per iteration because of bounds checking.
2. The effects of hardware branch prediction.

(1) is eliminated with final switch. I know this optimization is not done with dmd, but D doesn't need a new language feature because of that. (2) with a bit more work, this could also be implemented as a compiler optimization.

Both of these are worthy of bugzilla enhancement requests. I'd much rather implement improvements as better code generation rather than more language features.
Jul 20 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/20/2014 1:50 PM, Walter Bright wrote:
 Both of these are worthy of bugzilla enhancement requests.

Amazingly, these suddenly appeared: https://issues.dlang.org/show_bug.cgi?id=13169 https://issues.dlang.org/show_bug.cgi?id=13170
Jul 20 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/20/2014 2:38 PM, bearophile wrote:
 Are such optimizations portable and guaranteed on all D compilers? If the
 answer is negative, then they can't replace a _standard_ D syntax for computed
 gotos.

C'mon, bearophile. Optimizations are always implementation dependent. People rely on them for fast code. This isn't any different.
Jul 20 2014
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 21-Jul-2014 00:50, Walter Bright wrote:
 On 7/20/2014 3:27 AM, Dmitry Olshansky wrote:
 Functional programming is full of simple recursion and it would be
 nice not to
 stack overflow in debug builds.

Traditional FP languages don't have loops, and so must do recursion. D has loops, even in pure functions, there's no reason not to use them.

 Another use case is so-called threaded code interpreter, that can be
 done with
 either computed gotos (and bunch of labels) or forced tail calls (and
 bunch of
 functions). In both cases computed jump or tail call is indirect.

http://eli.thegreenplace.net/2012/07/12/computed-goto-for-efficient-dispatch-tables/

when used as a direct replacement of switch, and you are correct in that the said code could be easily optimized with final switch. What would help a lot is a threaded-code interpreter, that is, bytecode that contains direct addresses used in the computed goto.
 The computed goto is faster for two reasons, according to the article:

 1.The switch does a bit more per iteration because of bounds checking.

Now let's consider a proper implementation of a threaded-code interpreter, where the *code pointer points to an array of addresses. We've been through this before, and it turns out switch is slower because of an extra load.

a) Switch does 1 load for the opcode, 1 load for the jump table, and 1 indirect jump to advance (not even counting the bounds checking of the switch).

b) Threaded code via (say) computed goto does 1 load of the opcode and 1 indirect jump, because the opcode is an address already (so there is no separate jump table).

I'm certain that forced tail call would work just fine instead of computed goto for this scenario. In fact I've measured this with LDC and the results are promising, but only work with -O2/-O3 (where the tail call is optimized). I'd gladly dig them up if you are interested.

-- Dmitry Olshansky
Jul 20 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/20/2014 3:10 PM, Dmitry Olshansky wrote:
 The computed goto is faster for two reasons, according to the article:

 1.The switch does a bit more per iteration because of bounds checking.

Now let's consider proper implementation of thread-code interpreter. where *code pointer points to an array of addresses. We've been through this before and it turns out switch is slower because of an extra load. a) Switch does 1 load for opcode, 1 load for the jump table, 1 indirect jump to advance (not even counting bounds checking of the switch) b) Threaded-code via (say) computed goto does 1 load of opcode and 1 indirect jump, because opcode is an address already (so there is no separate jump table).

True, but I'd like to find a way that this can be done as an optimization.
 I'm certain that forced tail call would work just fine instead of computed goto
 for this scenario. In fact I've measured this with LDC and the results are
 promising but only work with -O2/-O3 (where tail call is optimized). I'd gladly
 dig them up if you are interested.

I'm pretty reluctant to add language features that can be done as optimizations.
Jul 20 2014
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 21-Jul-2014 02:29, Walter Bright wrote:
 On 7/20/2014 3:10 PM, Dmitry Olshansky wrote:
 The computed goto is faster for two reasons, according to the article:

 1.The switch does a bit more per iteration because of bounds checking.

Now let's consider proper implementation of thread-code interpreter. where *code pointer points to an array of addresses. We've been through this before and it turns out switch is slower because of an extra load. a) Switch does 1 load for opcode, 1 load for the jump table, 1 indirect jump to advance (not even counting bounds checking of the switch) b) Threaded-code via (say) computed goto does 1 load of opcode and 1 indirect jump, because opcode is an address already (so there is no separate jump table).

True, but I'd like to find a way that this can be done as an optimization.

would overflow the stack. I would rather find some way that works without the -O flag. In fact it brings up another unrelated problem with Phobos: any template-heavy library has amazingly awful speed w/o inlining & optimization enabled _by the client_. It should be the same with C++ though.
 I'm certain that forced tail call would work just fine instead of
 computed goto
 for this scenario. In fact I've measured this with LDC and the results
 are
 promising but only work with -O2/-O3 (where tail call is optimized).
 I'd gladly
 dig them up if you are interested.

I'm pretty reluctant to add language features that can be done as optimizations.

The point is - software that only works in release build is kind of hard to develop, even more so with libraries. Thus I'm in opposition to labeling such things as optimization when they, in fact, change semantics. -- Dmitry Olshansky
Jul 23 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/23/2014 1:06 PM, Dmitry Olshansky wrote:
 The point is - software that only works in release build is kind of hard to
 develop, even more so with libraries. Thus I'm in opposition to labeling such
 things as optimization when they, in fact, change semantics.

-release, -inline, -O and -g are all separate flags for good reasons. You can mix and match as convenient for what you're trying to accomplish.
Jul 23 2014
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 24-Jul-2014 00:14, Walter Bright wrote:
 On 7/23/2014 1:06 PM, Dmitry Olshansky wrote:
 The point is - software that only works in release build is kind of
 hard to
 develop, even more so with libraries. Thus I'm in opposition to
 labeling such
 things as optimization when they, in fact, change semantics.

-release, -inline, -O and -g are all separate flags for good reasons. You can mix and match as convenient for what you're trying to accomplish.

Unless you are a library writer, and then it's the user who mixes and matches flags, and you are foobared unless it works with them all :) -- Dmitry Olshansky
Jul 23 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/23/2014 1:20 PM, Dmitry Olshansky wrote:
 Unless you are library writer, and then it's user who mixes and matches flags
 and you are foobared unless it works with them all :)

I think it's a generally unworkable rabbit hole if we add language features to turn on/off various optimizations.
Jul 23 2014
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 24-Jul-2014 00:43, Walter Bright wrote:
 On 7/23/2014 1:20 PM, Dmitry Olshansky wrote:
 Unless you are library writer, and then it's user who mixes and
 matches flags
 and you are foobared unless it works with them all :)

I think it's a generally unworkable rabbit hole if we add language features to turn on/off various optimizations.

to the language that is generally useful (unlike computed goto, which is very niche and dangerous). Obviously optimizations won't solve this problem; with that said, it would be awesome to have your 2 final switch enhancements. -- Dmitry Olshansky
Jul 23 2014
next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 24-Jul-2014 01:05, bearophile wrote:
 Dmitry Olshansky:

 (unlike compute-goto which is very niche and dangerous).

It's a niche that system languages are the few fit for. Even CPython (and other interpreters) are using that feature.

All of them fast portable interpreters, i.e. not JITs. It's still a niche, and a feature should have uses beyond that, or be better tailored for this use case. Forced tail call was my best shot - it's useful for functional programming and it works for fast interpreters. -- Dmitry Olshansky
Jul 23 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/23/2014 1:53 PM, Dmitry Olshansky wrote:
 Obviously optimizations won't solve this problem, with that said it would be
 awesome to have your 2 final switch enhancements.

Yes. Feel free to ping me if I forget about them. Also, I particularly like the first of them because it's not available to C/C++.
Jul 23 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/23/14, 2:41 PM, Walter Bright wrote:
 On 7/23/2014 1:53 PM, Dmitry Olshansky wrote:
 Obviously optimizations won't solve this problem, with that said it
 would be
 awesome to have your 2 final switch enhancements.

Yes. Feel free to ping me if I forget about them.

bugzilla -- Andrei
Jul 23 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/23/2014 2:42 PM, Andrei Alexandrescu wrote:
 On 7/23/14, 2:41 PM, Walter Bright wrote:
 On 7/23/2014 1:53 PM, Dmitry Olshansky wrote:
 Obviously optimizations won't solve this problem, with that said it
 would be
 awesome to have your 2 final switch enhancements.

Yes. Feel free to ping me if I forget about them.

bugzilla -- Andrei

They're already there:

https://issues.dlang.org/show_bug.cgi?id=13169
https://issues.dlang.org/show_bug.cgi?id=13170

But there's so much in bugzilla I often overlook things.
Jul 23 2014
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/23/14, 1:14 PM, Walter Bright wrote:
 On 7/23/2014 1:06 PM, Dmitry Olshansky wrote:
 The point is - software that only works in release build is kind of
 hard to
 develop, even more so with libraries. Thus I'm in opposition to
 labeling such
 things as optimization when they, in fact, change semantics.

-release, -inline, -O and -g are all separate flags for good reasons. You can mix and match as convenient for what you're trying to accomplish.

... and -noboundscheck. -- Andrei
Jul 23 2014
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 07/20/2014 10:50 PM, Walter Bright wrote:
 On 7/20/2014 3:27 AM, Dmitry Olshansky wrote:
 Functional programming is full of simple recursion and it would be
 nice not to
 stack overflow in debug builds.

Traditional FP languages don't have loops, and so must do recursion.

Uh...
 D has loops, even in pure functions,

So does Haskell.

import Control.Monad
import Control.Monad.ST
import Data.STRef

factorial :: Integer -> Integer
factorial n = runST $ do
  r <- newSTRef 1
  forM_ [1..n] $ \i ->
    modifySTRef r (*i)
  readSTRef r

main = print $ factorial 5 -- "120"
 there's no reason not to use them.
 ...

But of course there are reasons to use tail calls.
Jul 20 2014
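(As an aside, the rewrite Walter appeals to, turning a self tail call into a loop that rebinds the parameters, is purely mechanical. A minimal sketch in Python, which, like D, does not eliminate tail calls; the function names here are made up for illustration:)

```python
# Tail-recursive sum: every call pushes a stack frame, so a deep enough
# input overflows the stack when tail calls are not eliminated.
def sum_rec(n, acc=0):
    if n == 0:
        return acc
    return sum_rec(n - 1, acc + n)   # call in tail position

# The mechanical rewrite: the self tail call becomes a loop that rebinds
# the parameters, so the stack never grows.
def sum_loop(n, acc=0):
    while n != 0:
        n, acc = n - 1, acc + n
    return acc

print(sum_rec(100))       # 5050
print(sum_loop(100000))   # 5000050000, with no risk of stack overflow
```

This is exactly what a compiler doing tail-call elimination would emit; the debate above is about who performs the rewrite, the programmer or the compiler.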
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/20/14, 8:15 PM, Timon Gehr wrote:
 On 07/20/2014 10:50 PM, Walter Bright wrote:
 On 7/20/2014 3:27 AM, Dmitry Olshansky wrote:
 Functional programming is full of simple recursion and it would be
 nice not to
 stack overflow in debug builds.

Traditional FP languages don't have loops, and so must do recursion.

Uh...
 D has loops, even in pure functions,

So does Haskell.

import Control.Monad
import Control.Monad.ST
import Data.STRef

factorial :: Integer -> Integer
factorial n = runST $ do
  r <- newSTRef 1
  forM_ [1..n] $ \i ->
    modifySTRef r (*i)
  readSTRef r

main = print $ factorial 5 -- "120"

And that doesn't look awkward at all :o). -- Andrei
Jul 20 2014
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 07/21/2014 06:40 AM, Andrei Alexandrescu wrote:
 On 7/20/14, 8:15 PM, Timon Gehr wrote:
 On 07/20/2014 10:50 PM, Walter Bright wrote:
 On 7/20/2014 3:27 AM, Dmitry Olshansky wrote:
 Functional programming is full of simple recursion and it would be
 nice not to
 stack overflow in debug builds.

Traditional FP languages don't have loops, and so must do recursion.

Uh...
 D has loops, even in pure functions,

So does Haskell.

import Control.Monad
import Control.Monad.ST
import Data.STRef

factorial :: Integer -> Integer
factorial n = runST $ do
  r <- newSTRef 1
  forM_ [1..n] $ \i ->
    modifySTRef r (*i)
  readSTRef r

main = print $ factorial 5 -- "120"

And that doesn't look awkward at all :o). -- Andrei

Indeed. This just creates a variable, modifies it 'n' times in a loop and then reads the result.
Jul 21 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/21/14, 3:22 PM, Timon Gehr wrote:
 On 07/21/2014 06:40 AM, Andrei Alexandrescu wrote:
 On 7/20/14, 8:15 PM, Timon Gehr wrote:
 On 07/20/2014 10:50 PM, Walter Bright wrote:
 On 7/20/2014 3:27 AM, Dmitry Olshansky wrote:
 Functional programming is full of simple recursion and it would be
 nice not to
 stack overflow in debug builds.

Traditional FP languages don't have loops, and so must do recursion.

Uh...
 D has loops, even in pure functions,

So does Haskell.

import Control.Monad
import Control.Monad.ST
import Data.STRef

factorial :: Integer -> Integer
factorial n = runST $ do
  r <- newSTRef 1
  forM_ [1..n] $ \i ->
    modifySTRef r (*i)
  readSTRef r

main = print $ factorial 5 -- "120"

And that doesn't look awkward at all :o). -- Andrei

Indeed.

Code sucks. It's its own destruction. -- Andrei
Jul 22 2014
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 07/22/2014 05:49 PM, Andrei Alexandrescu wrote:
 On 7/21/14, 3:22 PM, Timon Gehr wrote:
 On 07/21/2014 06:40 AM, Andrei Alexandrescu wrote:
 On 7/20/14, 8:15 PM, Timon Gehr wrote:
 On 07/20/2014 10:50 PM, Walter Bright wrote:
 ...
 D has loops, even in pure functions,

So does Haskell.

import Control.Monad
import Control.Monad.ST
import Data.STRef

factorial :: Integer -> Integer
factorial n = runST $ do
  r <- newSTRef 1
  forM_ [1..n] $ \i ->
    modifySTRef r (*i)
  readSTRef r

main = print $ factorial 5 -- "120"

And that doesn't look awkward at all :o). -- Andrei

Indeed.

Code sucks. It's its own destruction. -- Andrei

It is easy to understand and works correctly, but of course this is not a canonical implementation of factorial.
Jul 22 2014
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/20/2014 8:15 PM, Timon Gehr wrote:
 On 07/20/2014 10:50 PM, Walter Bright wrote:
 On 7/20/2014 3:27 AM, Dmitry Olshansky wrote:
 Functional programming is full of simple recursion and it would be
 nice not to
 stack overflow in debug builds.

Traditional FP languages don't have loops, and so must do recursion.

Uh...
 D has loops, even in pure functions,

So does Haskell. import Control.Monad

Uh-oh! Monad! :-)
Jul 20 2014
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 07/21/2014 06:56 AM, Walter Bright wrote:
 On 7/20/2014 8:15 PM, Timon Gehr wrote:
 On 07/20/2014 10:50 PM, Walter Bright wrote:
 On 7/20/2014 3:27 AM, Dmitry Olshansky wrote:
 Functional programming is full of simple recursion and it would be
 nice not to
 stack overflow in debug builds.

Traditional FP languages don't have loops, and so must do recursion.

Uh...
 D has loops, even in pure functions,

So does Haskell. import Control.Monad

Uh-oh! Monad! :-)

The example just uses the ST Monad which is quite similar to weakly pure statements in D.
Jul 21 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/21/2014 3:25 PM, Timon Gehr wrote:
 The example just uses the ST Monad which is quite similar to weakly pure
 statements in D.

D doesn't have weakly pure statements - it has weakly pure functions.
Jul 21 2014
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 07/22/2014 05:02 AM, Walter Bright wrote:
 On 7/21/2014 3:25 PM, Timon Gehr wrote:
 The example just uses the ST Monad which is quite similar to weakly pure
 statements in D.

D doesn't have weakly pure statements - it has weakly pure functions.

The type checker distinguishes between statements that are allowed in 'pure'-qualified functions and statements that are not.
Jul 22 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/20/2014 8:15 PM, Timon Gehr wrote:
 So does Haskell.

 import Control.Monad
 import Control.Monad.ST
 import Data.STRef

 factorial :: Integer -> Integer
 factorial n = runST $ do
    r <- newSTRef 1
    forM_ [1..n] $ \i->
      modifySTRef r (*i)
    readSTRef r

Would you rather write that or:

int factorial(int i) pure {
    auto r = 1;
    foreach (i; 1..n + 1)
        r *= i;
    return r;
}

And I'd love to see what native code is generated for the Haskell example.
Jul 22 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/22/2014 11:44 AM, bearophile wrote:
 Walter Bright:

 Would you rather write that or:

How much experience do you have in writing Haskell programs longer than 50 lines of code?

Zero.
 I have a large respect for your experience and your intelligence, but
 be careful with comments like that, to not show just your ignorance...

 If you are a Haskell programmer you usually don't write code like that, so in
 practice it's not a problem.

Does that imply you agree with my earlier point, then, that D doesn't need tail recursion because it can write loops instead?
 And even if sometimes you have to write awkward
 code, the Haskell programmers can assure you that on average the advantages
 given by Haskell purity far outweighs the disadvantages.

 On overall I prefer D over Haskell, but if you want to criticize Haskell you
 need much much better arguments and way bigger cannons :-)

The point was not at all to criticize Haskell. The point was that D does not need tail recursion because D supports writing loop constructs. Showing how loops could be written using awkward monads in Haskell that Haskell programmers wouldn't write does not change that point.
Jul 22 2014
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 07/22/2014 09:42 PM, Walter Bright wrote:
 On overall I prefer D over Haskell, but if you want to criticize
 Haskell you
 need much much better arguments and way bigger cannons :-)

The point was not at all to criticize Haskell. The point was that D does not need tail recursion because D supports writing loop constructs.

Tail call support is still useful. Looping is not the main reason for supporting tail calls. E.g. a tail call might be indirect, and only sometimes recurse on the same function and sometimes call another function.
 Showing how loops could be written using awkward

The point is not to criticize? :-)
 monads  in Haskell that
 Haskell programmers wouldn't write does not change that point.

Sure, that part of the discussion grew out of the claim that there are "no loops" there.
Jul 22 2014
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 07/22/2014 11:23 PM, Timon Gehr wrote:
 Looping is not the main reason for supporting tail calls.

(In D!)
Jul 22 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/22/2014 2:23 PM, Timon Gehr wrote:
 On 07/22/2014 09:42 PM, Walter Bright wrote:
 Showing how loops could be written using awkward

The point is not to criticize? :-)

Yes. If you want to infer criticism of Haskell from that, go ahead, but none was intended. You can write awkward code in any language.
 monads  in Haskell that
 Haskell programmers wouldn't write does not change that point.

Sure, that part of the discussion grew out of the claim that there are "no loops" there.

Bending a language over backwards to be pedantic is pointless for this discussion. Nobody claims "Obfuscated C Contest" entries are best (or even suggested) practices.
Jul 22 2014
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 07/23/2014 12:14 AM, Walter Bright wrote:

Sure, that part of the discussion grew out of the claim that there are "no loops" there.

Bending a language over backwards

Not what happened.
 to be pedantic

Being pedantic is important enough.
 is pointless for this discussion.

If one doesn't want "pointless" "pedantry" my suggestion would be to not make wrong claims about apparently irrelevant things, or to ignore the rebuttal to communicate silent agreement.
 Nobody claims "Obfuscated C Contest" entries are best (or
 even suggested) practices.

Irrelevant. I'll leave here.
Jul 22 2014
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 07/22/2014 08:34 PM, Walter Bright wrote:
 On 7/20/2014 8:15 PM, Timon Gehr wrote:
 So does Haskell.

 import Control.Monad
 import Control.Monad.ST
 import Data.STRef

 factorial :: Integer -> Integer
 factorial n = runST $ do
    r <- newSTRef 1
    forM_ [1..n] $ \i->
      modifySTRef r (*i)
    readSTRef r

Would you rather write that or:

 int factorial(int i) pure {
     auto r = 1;
     foreach (i; 1..n + 1)
         r *= i;
     return r;
 }
 ...

If I fix the typo, this computes the same numbers for inputs int.min up to and including 12.
 And I'd love to see what native code is generated for the Haskell example.

Well, the _actually_ corresponding D code using BigInt and DMD is significantly slower than what GHC generates here. In any case, I don't think that this kind of mutation-heavy code is a core focus of GHC, so the assembly will probably not be nearly as well optimized as it could be.

But if you'd like to compare the offers of D and Haskell a little further, consider the following cute code, which I wrote in 5 minutes and which computes the 1000000th natural number that is a product of powers of 2, 3 and 5 well below a second on my machine:

merge :: Ord a => [a] -> [a] -> [a]
merge xs [] = xs
merge [] xs = xs
merge xxs@(x:xs) yys@(y:ys) = case compare x y of
  LT -> x : merge xs yys
  EQ -> x : merge xs ys
  GT -> y : merge xxs ys

hamming :: [Integer]
hamming = 1 : merge (map (*2) hamming) (merge (map (*3) hamming) (map (*5) hamming))

main = print $ last $ take 1000000 hamming

Output:

519312780448388736089589843750000000000000000000000000000000000000000000000000000000

What do you think would be the corresponding idiomatic/fast D code? Does it outperform the Haskell code?
Jul 22 2014
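(The question about the corresponding D code was left open in the thread. For reference, the same sequence can also be computed imperatively with the classic three-pointer method due to Dijkstra; this is a sketch in Python rather than D, so it is an illustration of the algorithm, not a performance comparison with GHC:)

```python
def hamming(n):
    """Return the first n numbers of the form 2^a * 3^b * 5^c, in order."""
    h = [1]
    i2 = i3 = i5 = 0  # indices of the next values to multiply by 2, 3 and 5
    while len(h) < n:
        nxt = min(2 * h[i2], 3 * h[i3], 5 * h[i5])
        h.append(nxt)
        # advance every pointer that produced nxt, so duplicates are skipped
        if nxt == 2 * h[i2]: i2 += 1
        if nxt == 3 * h[i3]: i3 += 1
        if nxt == 5 * h[i5]: i5 += 1
    return h

print(hamming(10))  # [1, 2, 3, 4, 5, 6, 8, 9, 10, 12]
```

Unlike the lazy-merge Haskell version, this keeps the whole prefix in an array, which is O(n) space but needs no laziness or garbage-collected thunks.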
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/22/2014 2:08 PM, Timon Gehr wrote:
 And I'd love to see what native code is generated for the Haskell example.

Well, the _actually_ corresponding D code using BigInt and DMD is significantly slower than what GHC generates here.

Irrelevant to the looping issue - comparing the code gen of D's BigInt to Haskell's BigInt is not the issue here.
Jul 22 2014
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/22/2014 2:29 PM, bearophile wrote:
 A good part of the performance difference is caused by the D GC and the fact
 that Haskell GHC uses the GNU multiprecision libs for its bigints. In the
 meantime we have to maintain, optimize and debug our bigint implementation,
 wasting time better spent elsewhere.

I.e. the perf difference has nothing to do with Haskell or tail recursion.
Jul 22 2014
prev sibling parent reply Tobias Müller <troplin bluewin.ch> writes:
"Andrew Godfrey" <X y.com> wrote:
 1) A function annotation that means "I will call myself
 recursively, and when I do, I expect the tail recursion
 optimization." I have seen code which allocates something big on
 the stack and depends on the optimization. So this intent should
 be expressible.

Wouldn't it be more useful to have a modified/annotated return statement for that? Tail-recursiveness is an implementation detail; for the user of the function it's not really interesting. Except for the fact that it gives bounded stack usage, which is a useful property by itself and not only for tail-recursive functions. Tobi
Jul 21 2014
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 07/22/2014 12:10 AM, bearophile wrote:
 Except for the fact that it has bounded stack size which is a useful
 property by itself and not only for tailrecursive functions.

Yes, that's why I have said that a continuation is a more general solution than tailrec. Bye, bearophile

(Continuations and a tail calls are not the same kind of thing.) http://en.wikipedia.org/wiki/Continuation#First-class_continuations http://en.wikipedia.org/wiki/Delimited_continuation http://en.wikipedia.org/wiki/Tail_call
Jul 21 2014
prev sibling next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Tue, Jul 22, 2014 at 11:34:20AM -0700, Walter Bright via Digitalmars-d wrote:
 On 7/20/2014 8:15 PM, Timon Gehr wrote:
So does Haskell.

import Control.Monad
import Control.Monad.ST
import Data.STRef

factorial :: Integer -> Integer
factorial n = runST $ do
   r <- newSTRef 1
   forM_ [1..n] $ \i->
     modifySTRef r (*i)
   readSTRef r

Would you rather write that or: int factorial(int i) pure { auto r = 1; foreach (i; 1..n + 1) r *= i; return r; } And I'd love to see what native code is generated for the Haskell example.

I'd rather write:

	int factorial(int n) pure {
		return reduce!((a,b) => a*b)(1, iota(2, n));
	}

;-)

With gdc -O3, this produces almost the same code as your looped version. But I can barely read the disassembly, because it appears that gdc -O3 triggers aggressive vectorization, so there are tons of instructions manipulating xmm registers and the loop is unrolled by 8 iterations. I have this lurking suspicion that this may actually perform more poorly than the naïve version for small values of n. :-)

DMD, with -release -O -inline, was unable to inline the call to reduce, but was at least able to optimize reduce+iota into the equivalent of a simple foreach loop. So it looks quite promising. We should push for more aggressive optimizations of functional-style constructs in D; I think there's a lot of potential here.


T

-- 
My program has no bugs! Only unintentional features...
Jul 22 2014
prev sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 22 July 2014 at 19:12:31 UTC, H. S. Teoh via
Digitalmars-d wrote:
 I'd rather write:

 	int factorial(int n) pure {
 		return reduce!((a,b) => a*b)(1, iota(2, n));
 	}

 ;-)

This is how we fucked UFCS.
Jul 22 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 In general, stack overflow checking at compile time is the 
 halting problem. It needs a runtime check.

There are several systems, including SPARK, that perform a conservative and apparently acceptable stack overflow check at compile time. If you don't agree with what I've written in my post, then please give a more detailed answer to the points I've written above.
 Stack overflows are not safety problems when a guard page is 
 used past the end of the stack.

It's not a safety problem in Erlang/Rust, because those languages are designed to manage such failures in a good way. In most other languages it's a "safety" problem, if your program execution has some importance. Bye, bearophile
Jun 26 2014
prev sibling next sibling parent "Araq" <rumpf_a web.de> writes:
 Spark is a research language that does not work, as I've 
 discovered and discussed with you before. It cannot be 
 determined the max stack usage at compile time, again, this is 
 the halting problem.

What?! It's easily solvable: Forbid recursion and indirect function calls and it's guaranteed that the program only requires a fixed size stack and you can compute an upper bound of the required stack size at compile-time. Which is BTW exactly what OpenCL does as GPUs tend to have no stacks. In what way is Spark a "research language that does not work"? And how many language design issues need to be discovered until you admit that Safe-D is a "research language that does not work"?
Jun 26 2014
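(The restriction Araq describes, no recursion and no indirect calls, makes the call graph a DAG, so the worst-case stack usage is just the longest weighted path from the entry point. A minimal sketch in Python, with a made-up call graph and frame sizes:)

```python
# Hypothetical call graph: function name -> (frame size in bytes, direct callees).
# With recursion and indirect calls forbidden, this graph is acyclic.
frames = {
    "main":  (128, ["parse", "run"]),
    "parse": (256, ["lex"]),
    "lex":   (64,  []),
    "run":   (512, ["step"]),
    "step":  (96,  []),
}

def max_stack(fn, memo=None):
    """Upper bound on stack bytes used by calling fn: longest path in the DAG."""
    if memo is None:
        memo = {}
    if fn not in memo:
        size, callees = frames[fn]
        memo[fn] = size + max((max_stack(c, memo) for c in callees), default=0)
    return memo[fn]

print(max_stack("main"))  # 736 bytes, via main -> run -> step
```

This is essentially the interprocedural analysis SPARK-style tools and OpenCL compilers perform; the whole-program requirement Andrei mentions is visible in the fact that every callee's frame size must be known.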
prev sibling next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Jun 26, 2014 at 04:43:33PM -0700, Andrei Alexandrescu via Digitalmars-d
wrote:
 On 6/26/14, 2:01 PM, Araq wrote:
Spark is a research language that does not work, as I've discovered
and discussed with you before. It cannot be determined the max stack
usage at compile time, again, this is the halting problem.

What?! It's easily solvable: Forbid recursion and indirect function calls

I do agree that a useful subset of a language can be conservatively defined that doesn't require solving the halting problem. But that's not easy at all - it requires interprocedural analysis.

And it may not be feasible for a compiler to automatically prove. One possible approach is to have the user supply the proof of eventual termination, which can be mechanically verified by the compiler. (Checking a supplied proof for correctness is more tractable than coming up with the proof in the first place.) But to handle proofs of non-trivial code beyond just toy examples, you might end up needing a full-scale deductive proof subsystem in the compiler, which may or may not be practical for D's needs! T -- That's not a bug; that's a feature!
Jun 26 2014
prev sibling next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Jun 26, 2014 at 04:47:24PM -0700, Andrei Alexandrescu via Digitalmars-d
wrote:
 On 6/26/14, 4:16 PM, Timon Gehr wrote:
- Annotations can include a formal proof.

If a function can be annotated with what other functions it calls (non-transitively), I agree that it can be guaranteed with local semantic checking that a program won't overflow. However requiring such annotations would be onerous, and deducing them would require whole program analysis.
We've had similar discussions before. Why do we _still_ have to argue
about the _possibility_ of having a system that is helpful for
proving stack overflows (or other bad behaviours) absent?

You can say "out of scope" or "not a priority" or "this should be
realized in third-party tool support" but not "it is impossible".

I also seem to reckon Walter is in the other extreme (he asserts it can't be done at all, period). My understanding is that it can be done but only with annotations or whole program analysis.

I think the compiler should be able to tell, at least for the simplest cases, whether something will definitely stop recursing. Proving that something will *not* stop recursing is unsolvable in the general case, but even if we restrict it to a solvable subset, it's still far from trivial for the compiler to invent such a proof (unless we restrict it so much that it excludes too many useful algorithms).

One approach is to have the user supply a proof that the compiler can then check -- in general, if a proof exists at all, it should be checkable. Such checks furthermore can probably be done fast enough so as to not adversely affect current compilation times (unless the proof is ridiculously complex, which for practical real-world applications I don't think will happen).

However, absence of proof is not proof of absence; just because neither the compiler nor the user is able to come up with a proof that something will stop recursing, doesn't mean that it definitely won't stop recursing. So the compiler cannot definitively reject the code as definitely overflowing the stack. But we *can* make it so that the user tells the compiler "please stop compilation if neither of us can supply a proof that this function will stop recursing". But it has to be opt-in, because there will be many real-world applications that cannot be proven to stop recursing, even though in practice they always will, so we cannot make this a language default.

One particular example that comes to mind is the compiler itself: as it parses the input program, there is in theory no guarantee that the input won't have an arbitrarily deep nesting, such that the AST generated by the parser will be infinitely deep, because you cannot statically prove that the input will terminate. You don't know if the input file is actually connected to the output pipe of a program that prints an infinite series of ever-deeper nested structs, for example.

However, in practice, such input never occurs, so we generally don't worry about such contrived possibilities. But this does mean that there can be no proof of termination of recursion, even user-supplied ones -- because there *are* cases where the parser *will* recurse forever, even if it never happens in practice. Not accounting for that possibility invalidates the proof, so any proof will always be rejected. Therefore, it is impossible to prove that recursion in the compiler is finite, even though in practice it always is.


T

-- 
"Life is all a great joke, but only the brave ever get the point." -- Kenneth Rexroth
Jun 26 2014
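(The asymmetry described above, that verifying a supplied termination argument is far easier than discovering one, can be shown with a toy runtime checker: the caller supplies a non-negative "variant" measure that must strictly decrease on every recursive call, and the checker only verifies that claim. Everything below is a made-up Python illustration, not a feature of any real compiler:)

```python
def checked(f, variant):
    """Run f with open recursion through g, verifying that the supplied
    variant is non-negative and strictly decreases on every nested call."""
    stack = []
    def g(*args):
        m = variant(*args)
        assert m >= 0, "variant must be non-negative"
        if stack:
            assert m < stack[-1], "recursive call did not decrease the variant"
        stack.append(m)
        try:
            return f(g, *args)
        finally:
            stack.pop()
    return g

# The user's proof obligation: n itself decreases on each recursive call.
fact = checked(lambda self, n: 1 if n <= 1 else n * self(n - 1), lambda n: n)
print(fact(10))  # 3628800, and every nested call passed the variant check
```

A function whose variant does not decrease, e.g. one that calls itself with the same argument, fails the check immediately instead of recursing forever, which is the whole point: the proof is supplied, the machine only checks it.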
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Thursday, 26 June 2014 at 09:19:05 UTC, bearophile wrote:
 Walter Bright:

 It's an interesting list, and an opportunity for D. I once 
 said that my job was to put Coverity out of business. The more 
 of these issues D can automatically prevent with @safe, the 
 better.

One kind of problem left is to avoid stack overflows. I have had several such cases in my D code (perhaps because I use fixed-size arrays a lot). I think they can be caused by: 1) Too many large allocations in stack frames;

If generated by the compiler, they can be made safe
 2) By alloca();

it is @system
 3) By recursion and co-recursion chains;

We should have a page reserved at the end of the stack so we can throw when reaching it. The compiler can ensure we don't bypass it in case 1.
Jun 26 2014
prev sibling next sibling parent "ponce" <contact gam3sfrommars.fr> writes:
On Thursday, 26 June 2014 at 21:01:40 UTC, Araq wrote:
 Spark is a research language that does not work, as I've 
 discovered and discussed with you before. It cannot be 
 determined the max stack usage at compile time, again, this is 
 the halting problem.

What?! It's easily solvable: Forbid recursion and indirect function calls and it's guaranteed that the program only requires a fixed size stack and you can compute an upper bound of the required stack size at compile-time. Which is BTW exactly what OpenCL does as GPUs tend to have no stacks.

Actually CUDA does have recursion and indirect function calls.
Jun 29 2014
prev sibling next sibling parent "Ola Fosheim Grøstad" writes:
On Thursday, 26 June 2014 at 09:35:20 UTC, Walter Bright wrote:
 On 6/26/2014 2:19 AM, bearophile wrote:
 One kind of problem left is to avoid stack overflows.

In general, stack overflow checking at compile time is the halting problem. It needs a runtime check.

No, "is N bytes of stack sufficient" is decidable if you don't count the heap as an unlimited resource that preserves state. So it is not the halting problem. The halting problem would be "is K bytes of stack sufficient, for K in [0,infinity>". You can also do a worst case analysis that requires you to specify too much stack, but proves that it is sufficient. E.g. set the recursion depth to the max number of nodes in a binary tree, even if you know that it will never get that deep. But since you allow calls into C code you need extra annotations and probably have more pressing issues to deal with… (like GC).
Jun 29 2014
prev sibling next sibling parent "Ola Fosheim Grøstad" writes:
On Thursday, 26 June 2014 at 09:35:20 UTC, Walter Bright wrote:
 Stack overflows are not safety problems when a guard page is 
 used past the end of the stack. Then, overflow checking is done 
 in hardware. Guard pages aren't currently used for fibers, so 
 overflows are a real danger there.

But a page is only 2K? So what happens if you skip more than 2K and never touch the guard page? Does D prove that the stack pointer is never moved more than 2K-1 without a read or write in that range? Guard pages on a flat memory model are not as safe as a segmented memory model. A compromise would be to inject runtime checks to see if there is sufficient stack space whenever you move the stack pointer and remove them when you can prove that there is enough room. E.g. collapse the checks into larger spans of stack space by propagating them upwards in the call chain. Anyway, minimizing stack space is very useful for fibers in scientific simulations or real time systems since you want to be able to run as many as you can fit into memory. Each actor/agent could be very simple so I would not rule out the ability to prove it in most cases for some domains.
Jun 30 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
(Sorry for the very late answer.)

Walter Bright:

 Stack overflows are not safety problems when a guard page is 
 used past the end
 of the stack.

It's not a safety problem in Erlang/Rust, because those languages are designed to manage such failures in a good way.

Please explain.

The idea comes from the Erlang language (and perhaps Erlang has copied it from something else), and then Rust copied it (indeed, if you look at the "Influenced by" list here, you see Erlang, and Rust has copied only the Erlang feature I am discussing here: http://en.wikipedia.org/wiki/Rust_language ). Erlang systems must be extremely reliable; they run telecommunication systems that must just always work, with only seconds or minutes of downtime every year. But programs contain errors and bugs, so to face this problem Erlang (and now Rust) has chosen a curious strategy. For the description, see "4.3 Error handling philosophy" at pages 104-109 of this PDF file: http://www.erlang.org/download/armstrong_thesis_2003.pdf It's also explained a bit here, in the "3. What is fault-tolerance" section: http://stackoverflow.com/questions/3172542/are-erlang-otp-messages-reliable-can-messages-be-duplicated/3176864#3176864 Some more technical info: http://www.erlang.org/doc/design_principles/sup_princ.html Bye, bearophile
Jul 10 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
(Sorry for the very delayed answers.)

----------------

Walter Bright:

 Oh well, there goes about 90% of D programs and pretty much all 
 use of the D runtime library!

Right. On the other hand, the need for such strict safety requirements that forbid stack overflows is not present in desktop software; it's for uncommon systems, with limited resources, that need to work reliably, like some car/plane electronics. So when you have such needs you are willing to throw away 90% of D libs/Phobos, and probably not to use druntime too :-) And in such cases it's good to have as many compile-time assurances as possible.
 And I suspect the subset of D programs
 that don't have indirection or recursion is so small as to not 
 be worth the bother.

 I do know there are a class of applications, for example in 
 critical embedded
 systems, were recursion and indirection, and even allocation, 
 are not allowed.
 Using D in such cases would require eschewing using Phobos, and 
 some other care
 taken,

Right. And people in such fields are looking for something safer and nicer than C + a ton of static tests (and many of them don't seem to love Ada). ---------------- Andrei Alexandrescu:
My understanding is that it can be done but only with 
annotations or whole program analysis.<

I agree. But people in (for example) the automotive industry could be (and probably are) willing to do both. This is a specialized niche, so adding such annotations to the D language seems too much. So I've suggested adding enough static introspection (and some way to attach compile-time semantics to UDAs) to allow specialized programmers to add those annotations to D (and leave other more complex tests to external tools). ---------------- H. S. Teoh:
 I think the compiler should be able to tell, at least for the 
 simplest
 cases, whether something will definitely stop recursing.

The Whiley language (http://whiley.org/ ) is able to do this, with the "loop variants" you write. ---------------- deadalnix:
 2) By alloca();

it is @system

I'd like to use something like alloca (or much better, something similar to the variable-length stack-allocated partially-library-defined arrays of C++14/C++17) even in @safe code (but not in the stack-constrained code we were discussing here).
 We should have a page reserved at the end of the stack so we can
 throw when reaching it. The compiler can ensure we don't bypass
 it in case 1.

OK. ---------------- Ola Fosheim Grøstad:
 A compromise would be to inject runtime checks to see if there 
 is
 sufficient stack space whenever you move the stack pointer

Run-time tests in debug mode seem useful. Bye, bearophile
Jul 10 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Monday, 30 June 2014 at 08:00:37 UTC, Ola Fosheim Grøstad
wrote:
 On Thursday, 26 June 2014 at 09:35:20 UTC, Walter Bright wrote:
 Stack overflows are not safety problems when a guard page is 
 used past the end of the stack. Then, overflow checking is 
 done in hardware. Guard pages aren't currently used for 
 fibers, so overflows are a real danger there.

But a page is only 2K? So what happens if you skip more than 2K and never touch the guard page? Does D prove that the stack pointer is never moved more than 2K-1 without a read or write in that range?

The compiler can ensure that you hit at least every 4k or so. It doesn't look like a very hard constraint to have a volatile load per untouched 4k of stack (which should be very rare).
Jul 11 2014
prev sibling next sibling parent "Ola Fosheim Grøstad" writes:
On Friday, 11 July 2014 at 17:28:39 UTC, deadalnix wrote:
 The compiler can ensure that you hit at least every 4k or so. It
 doesn't look like a very hard constraint to have a volatile load
 per untouched 4k of stack (which should be very rare).

Sure, it should be possible to work it into the backend at the code-gen level. For fibers without recursion, I think the most powerful approach would be whole-program analysis to establish a minimum N for the typical case (without unbounded recursion). Of course, alloca then also has to extend the stack and do bookkeeping of the additional storage it takes, or simply use a separate alloca stack.
Jul 12 2014
prev sibling next sibling parent "Marc Schütz" <schuetzm gmx.net> writes:
On Sunday, 13 July 2014 at 03:25:08 UTC, Walter Bright wrote:
 On 7/11/2014 10:28 AM, deadalnix wrote:
 The compiler can ensure that you hit at least every 4k or so.

And it already does.

Doesn't look so:

    int foo(int a) {
        int[100] b = void;
        b[$-1] = a;
        return b[$-1];
    }

    int bar(int a) {
        int[8000] b = void;
        b[$-1] = a;
        return b[$-1];
    }

    # objdump -d bigstack.o | ddemangle | less

    Disassembly of section .text.int bigstack.foo(int):

    0000000000000000 <int bigstack.foo(int)>:
       0:   55                      push   %rbp
       1:   48 8b ec                mov    %rsp,%rbp
       4:   48 81 ec a0 01 00 00    sub    $0x1a0,%rsp
       b:   48 8d 45 f4             lea    -0xc(%rbp),%rax
       f:   89 38                   mov    %edi,(%rax)
      11:   8b 45 f4                mov    -0xc(%rbp),%eax
      14:   c9                      leaveq
      15:   c3                      retq
      16:   66 90                   xchg   %ax,%ax

    Disassembly of section .text.int bigstack.bar(int):

    0000000000000000 <int bigstack.bar(int)>:
       0:   55                      push   %rbp
       1:   48 8b ec                mov    %rsp,%rbp
       4:   48 81 ec 10 7d 00 00    sub    $0x7d10,%rsp
       b:   48 8d 45 f4             lea    -0xc(%rbp),%rax
       f:   89 38                   mov    %edi,(%rax)
      11:   8b 45 f4                mov    -0xc(%rbp),%eax
      14:   c9                      leaveq
      15:   c3                      retq
      16:   66 90                   xchg   %ax,%ax

(This is with DMD master on Linux x86_64. Same with -m32.)
Jul 13 2014
prev sibling next sibling parent "Marc Schütz" <schuetzm gmx.net> writes:
On Sunday, 13 July 2014 at 23:35:46 UTC, Walter Bright wrote:
 On 7/13/2014 4:04 AM, "Marc Schütz" <schuetzm gmx.net>" wrote:
 On Sunday, 13 July 2014 at 03:25:08 UTC, Walter Bright wrote:
 On 7/11/2014 10:28 AM, deadalnix wrote:
 The compiler can ensure that you hit at least every 4k or so.

And it already does.

Doesn't look so:

     int bar(int a) {
         int[8000] b = void;
         b[$-1] = a;
         return b[$-1];
     }

On Win32:

    _D4foo53barFiZi comdat
            assume  CS:_D4foo53barFiZi
            push    EBP
            mov     EBP,ESP
            mov     EDX,7
    L8:     sub     ESP,01000h
            test    [ESP],ESP
            dec     EDX
            jne     L8
            sub     ESP,0D04h
            lea     ECX,-8[EBP]
            mov     [ECX],EAX
            mov     EAX,-8[EBP]
            leave
            ret

It doesn't do it on Linux because gcc doesn't do it. But the capability is in the back end, and it does it for alloca(), too.

Hmm... but this is using DMD, not GDC. Or do you mean that DMD doesn't do it, because GCC doesn't do it either? If so, what is the reason for this? Why shouldn't this feature be enabled on every platform?
Jul 14 2014
prev sibling next sibling parent Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 14 July 2014 11:07, via Digitalmars-d <digitalmars-d puremagic.com> wrote:
 On Sunday, 13 July 2014 at 23:35:46 UTC, Walter Bright wrote:
 On 7/13/2014 4:04 AM, "Marc Schütz" <schuetzm gmx.net>" wrote:
 On Sunday, 13 July 2014 at 03:25:08 UTC, Walter Bright wrote:
 On 7/11/2014 10:28 AM, deadalnix wrote:
 The compiler can ensure that you hit at least every 4k or so.

And it already does.

Doesn't look so:

     int bar(int a) {
         int[8000] b = void;
         b[$-1] = a;
         return b[$-1];
     }

On Win32:

    _D4foo53barFiZi comdat
            assume  CS:_D4foo53barFiZi
            push    EBP
            mov     EBP,ESP
            mov     EDX,7
    L8:     sub     ESP,01000h
            test    [ESP],ESP
            dec     EDX
            jne     L8
            sub     ESP,0D04h
            lea     ECX,-8[EBP]
            mov     [ECX],EAX
            mov     EAX,-8[EBP]
            leave
            ret

It doesn't do it on Linux because gcc doesn't do it. But the capability is in the back end, and it does it for alloca(), too.

Hmm... but this is using DMD, not GDC. Or do you mean that DMD doesn't do it, because GCC doesn't do it either? If so, what is the reason for this? Why shouldn't this feature be enabled on every platform?

For GDC, there is -fstack-protector (which is turned on by default in distributed binaries from Ubuntu). However IIRC only functions that use alloca or have static arrays of type char are actually checked. Declaring an int[100] doesn't invoke alloca, so you won't see it. The bounds checking in D alone is enough to catch most common overflowing stack bugs. Regards Iain
Jul 14 2014
prev sibling next sibling parent "Andrew Godfrey" <X y.com> writes:
Returning to the earlier question, of helping to avoid stack
overflows:
I can easily think of two things the language could do.
(Forgive me if D already has them - I have read a fair amount but
not all the minutiae.)

1) A function annotation that means "I will call myself
recursively, and when I do, I expect the tail recursion
optimization." I have seen code which allocates something big on
the stack and depends on the optimization. So this intent should
be expressible.

2) Annotations about when a function does not expect re-entrancy
to be possible based on call-graph analysis. This would obviously
have to restrict what it can do in order to be feasible, but
wouldn't it still be useful? Besides - the cases where it is hard
for the compiler to detect the possibility of re-entrancy are
often cases where humans have trouble noticing it too.
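A minimal sketch in C of the pattern described in (1) (sum_to and sum_to_loop are illustrative names). Nothing in C, or in current D, *guarantees* the rewrite of the tail call into a jump; making the compiler verify and enforce it is exactly what the proposed annotation would add.

```c
/* The recursive call is the very last action taken, so an optimizer may
 * turn it into a jump and keep stack use constant. */
long sum_to(long n, long acc) {
    if (n == 0)
        return acc;                 /* base case */
    return sum_to(n - 1, acc + n);  /* tail position: eligible for TCO */
}

/* the loop the optimizer is expected to produce */
long sum_to_loop(long n) {
    long acc = 0;
    while (n != 0)
        acc += n--;
    return acc;
}
```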
Jul 19 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Andrew Godfrey:

 1) A function annotation that means "I will call myself
 recursively, and when I do, I expect the tail recursion
 optimization." I have seen code which allocates something big on
 the stack and depends on the optimization. So this intent should
 be expressible.

A @tailrec annotation seems good, and will improve the functional usages of D. It should stop compilation with an error if the function doesn't call itself or if the compiler is not able to remove the recursive call. This means it has to be fully enforced.
 2) Annotations about when a function does not expect re-entrancy
 to be possible based on call-graph analysis.

I don't understand. Assuming I know this (http://en.wikipedia.org/wiki/Reentrancy_%28computing%29 ) can you show an example? Bye, bearophile
Jul 19 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
 Andrew Godfrey:

 1) A function annotation that means "I will call myself
 recursively, and when I do, I expect the tail recursion
 optimization." I have seen code which allocates something big 
 on
 the stack and depends on the optimization. So this intent 
 should
 be expressible.

A @tailrec annotation seems good, and will improve the functional usages of D. It should stop compilation with an error if the function doesn't call itself or if the compiler is not able to remove the recursive call. This means it has to be fully enforced.

Perhaps a @cps (or @continuation) annotation is better and more general. Bye, bearophile
Jul 19 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 If you want to guarantee replacement of a recursive call with a 
 loop, just write a loop.

There are cases where a recursive algorithm is nicer. And people who want to use D functionally may also be used to writing code recursively. What about @continuation (http://en.wikipedia.org/wiki/Continuation-passing_style )? Bye, bearophile
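For readers unfamiliar with continuation-passing style, a minimal sketch in C (all names illustrative). In CPS a function never returns to its caller; it hands its result to an explicit continuation, so *every* call is a tail call, which is why CPS is the more general notion being contrasted with plain tail recursion. C has no closures, so the captured environment is an explicit struct; note that this sketch relies on the intermediate frames staying live, since C does not actually eliminate the tail calls.

```c
#include <stddef.h>

/* a continuation is a function pointer plus its captured environment */
struct cont {
    long (*fn)(long result, void *env);
    void *env;
};

static long halt(long result, void *env) {
    (void)env;
    return result;                    /* final continuation: just stop */
}

struct fact_env { long n; struct cont k; };

static long fact_step(long result, void *env) {
    struct fact_env *e = env;
    return e->k.fn(e->n * result, e->k.env);  /* resume outer continuation */
}

static long fact_cps(long n, struct cont k) {
    if (n <= 1)
        return k.fn(1, k.env);       /* tail call into the continuation */
    struct fact_env e = { n, k };
    struct cont next = { fact_step, &e };
    return fact_cps(n - 1, next);    /* tail call */
}

long factorial(long n) {
    struct cont done = { halt, NULL };
    return fact_cps(n, done);
}
```

A language with first-class closures would heap-allocate the environment instead of borrowing the caller's frame.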
Jul 19 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

I doubt they'll want to use an @tailrec attribute.

In Scala there is "@tailrec":
http://www.scala-lang.org/api/current/scala/annotation/tailrec.html
In both F# and OCaml there is the "rec" keyword:
http://msdn.microsoft.com/en-us/library/dd233232.aspx
http://caml.inria.fr/pub/docs/manual-ocaml-400/manual003.html#toc4
In Clojure there is "recur" (which is not an annotation):
http://clojure.org/special_forms#recur
I think functional programmers are willing to use a @tailrec attribute if it's well designed and does what it says on the tin.
What about the @continuation
(http://en.wikipedia.org/wiki/Continuation-passing_style )?

I doubt they'll want to use that attribute, either.

I don't know. It's more general than @tailrec, but probably many C and C++ and Java programmers don't even know what it is. But it allows a programming style that in some cases is interesting (far cleaner than computed gotos).
 In any case, D supports more styles of programming than any 
 other language I can think of. I doubt adding even more will be 
 that helpful.

I think a basic form of pattern matching implemented with the switch construct is a good idea for D. Bye, bearophile
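A hedged sketch of what "basic pattern matching with switch" lowers to, written in C as a tagged union dispatched on its tag (all names illustrative); a switch-based match in D would presumably compile down to roughly this shape.

```c
enum shape_tag { CIRCLE, RECT };

struct shape {
    enum shape_tag tag;              /* the "constructor" being matched */
    union {
        struct { double r; } circle;
        struct { double w, h; } rect;
    } u;
};

double area(const struct shape *s) {
    switch (s->tag) {                /* "match" on the constructor */
    case CIRCLE:
        return 3.141592653589793 * s->u.circle.r * s->u.circle.r;
    case RECT:
        return s->u.rect.w * s->u.rect.h;
    }
    return 0.0;                      /* unreachable for well-formed tags */
}
```

What a language-level match adds over this is exhaustiveness checking and binding of the payload fields.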
Jul 20 2014
prev sibling next sibling parent "Andrew Godfrey" <X y.com> writes:
On Sunday, 20 July 2014 at 06:06:16 UTC, bearophile wrote:
 Andrew Godfrey:

 2) Annotations about when a function does not expect 
 re-entrancy
 to be possible based on call-graph analysis.

I don't understand. Assuming I know this (http://en.wikipedia.org/wiki/Reentrancy_%28computing%29 ) can you show an example?

Sorry, by "re-entrancy" I just meant a statement of whether it is possible for the function to be re-entered (not whether it would be correct to do so). Thanks for the reminder, so I can exclude the complication of ISRs from my point.

In the codebase I work on - application code - there are no ISRs, but a common cause of unexpected re-entrancy is functions that call a Windows API which pumps messages. This produces unexpectedly deep call stacks and threatens stack overflow. But that's a complicated case that may not be feasible for the compiler to help with. So, a simpler example...

The general idea here is that you've convinced yourself you need to do something which stresses usage of stack space, and you believe it is safe to do because of *something*. Ideally you can express that "something" so that if anyone tries to change it later, the compiler catches it.

So, for example, you've allocated a buffer on the stack, and your function 'A' builds up data into it, then calls another function 'X' to 'send' that data in some way. Function 'A' definitely doesn't expect to be re-entered. (It may or may not be re-entrant in the definition of the wiki page. That doesn't matter here.)

Now, the first step probably doesn't need language help: call it "assertIAmNotReentered()". I think any solution probably still needs this as a backup. I say "assert" (i.e. a runtime check that you'd remove in a release build) because I'm assuming that if you use it a lot, you wouldn't want to pay for all those checks in a release build.

Now: suppose 'X' is a Phobos function, and 'A' doesn't call anything else. Then it seems feasible (but a lot of work) to provide an annotation (or combination of annotations) that means: "Compiler, I am not called by an ISR. You don't have to verify that. But given that, I claim that I am not re-entered. Please verify."
This is a lot of work - and potentially error-prone - because when Phobos calls the C runtime or OS APIs, humans would need to decide whether those functions 'may call back'. For this reason I don't think we'd want to eliminate "assertIAmNotReentered()".

By this process, Phobos functions which may call a 'callback' back into user code (whether directly or indirectly) would be distinguished from ones which cannot (let's call the latter "NoCallback", since I can't think of a non-terrible name right now). And then the annotation above can be enforced at least in those terms (i.e. the compiler ensures that 'X' only calls "NoCallback"-annotated functions).

I think this is valuable, and not just to avoid stack overflow. The message-pumping case often violates higher-level correctness.
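The runtime fallback described above can be sketched in C. All names here are hypothetical ("assertIAmNotReentered" is the poster's invented helper, not an existing API); a return code stands in for the assert so the detection is observable, and demo() shows a callback that loops back into the guarded function being caught.

```c
/* a per-function flag that detects re-entry, e.g. via a callback that
 * pumps messages back into us while our stack buffer is still live */
static int in_send = 0;

int send_buffer(void (*emit)(void)) {
    if (in_send)
        return -1;       /* re-entered: what the assert would catch */
    in_send = 1;
    if (emit)
        emit();          /* may run arbitrary code, even back into us */
    in_send = 0;
    return 0;
}

static int inner_rc;
static void reenter(void) { inner_rc = send_buffer(0); }

/* outer call succeeds; the re-entrant inner call is detected */
int demo(void) {
    int outer = send_buffer(reenter);
    return outer == 0 && inner_rc == -1;
}
```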
Jul 20 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 https://issues.dlang.org/show_bug.cgi?id=13169
 https://issues.dlang.org/show_bug.cgi?id=13170

Are such optimizations portable and guaranteed on all D compilers? If the answer is negative, then they can't replace a _standard_ D syntax for computed gotos. Bye, bearophile
Jul 20 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 20 July 2014 at 07:27:36 UTC, Walter Bright wrote:
 What about the  continuation
 (http://en.wikipedia.org/wiki/Continuation-passing_style )?

I doubt they'll want to use that attribute, either.

Especially if that can be done with AST macros :D
Jul 21 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Tobias Müller:

 Wouldn't it be more useful to have a modified/annotated return 
 statement for that?

I don't know.
 Tail-recursiveness is an implementation detail, for the user of 
 the function it's not really interesting.

Yes, in theory a @tailrec annotation doesn't change the mangling of the function. But D has exceptions and stack traces. A stack trace for an annotated function is different because there are fewer stack frames...
 Except for the fact that it has bounded stack size which is a 
 useful property by itself and not only for tailrecursive 
 functions.

Yes, that's why I said that @continuation is a more general solution than @tailrec. Bye, bearophile
Jul 21 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 And that doesn't look awkward at all :o). -- Andrei

A related thread: http://stackoverflow.com/questions/6622524/why-is-haskell-sometimes-referred-to-as-best-imperative-language Bye, bearophile
Jul 21 2014
prev sibling next sibling parent "Andrew Godfrey" <X y.com> writes:
On Monday, 21 July 2014 at 22:10:06 UTC, bearophile wrote:
 Tobias Müller:

 Wouldn't it be more useful to have a modified/annotated return 
 statement for that?

I don't know.

I see what you're saying, and it works for the quick sort example. I'm not sure it covers all tailrec cases, but it seems useful.
Jul 22 2014
prev sibling next sibling parent "Andrew Godfrey" <X y.com> writes:
 My understanding is that it can be done
 but only with annotations or whole program analysis.



I think that's true, but you don't necessarily have to annotate every function.

a) Possibly there's something interesting to do at the module level - I think more than one thing. E.g. a module that doesn't have any callbacks in its interface is 'interesting'. E.g. 'layering' of modules.

b) Some situations are particularly dangerous, and so a function annotation could be encouraged for those. E.g. if you have a recursive function without tail recursion, and the possible recursion depth is substantial, then while it is deep in its recursion it should limit what other functions it calls. Someone could come along later and add a logging statement to it, which usually isn't dangerous but here could be.

Quicksort is an instructive example because it has the security weakness that, although you expect the stack depth to typically be O(log n), an attacker in control of the input can force it to be O(n). Of course with tail recursion that doesn't threaten stack overflow, but it suggests that there are recursion cases we think are safe, and typically don't fall over, but are actually vulnerable. Which means that if we don't feel like annotating them in defense, we're being irresponsible in a way.
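The quicksort point can be made concrete. The standard defense is to recurse only into the smaller partition and loop on the larger one, which bounds the stack depth at O(log n) even on adversarial input; a sketch in C (illustrative, using the classic Hoare partition):

```c
static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

void quicksort(int *v, long lo, long hi) {   /* sorts v[lo..hi] inclusive */
    while (lo < hi) {
        int pivot = v[lo + (hi - lo) / 2];
        long i = lo - 1, j = hi + 1;
        for (;;) {                            /* Hoare partition */
            do i++; while (v[i] < pivot);
            do j--; while (v[j] > pivot);
            if (i >= j) break;
            swap_int(&v[i], &v[j]);
        }
        /* both [lo..j] and [j+1..hi] are nonempty here */
        if (j - lo < hi - (j + 1)) {
            quicksort(v, lo, j);     /* recurse into the smaller side */
            lo = j + 1;              /* iterate on the larger side */
        } else {
            quicksort(v, j + 1, hi);
            hi = j;
        }
    }
}
```

Since each recursive call handles at most half the range, the depth can't exceed about log2(n) frames no matter what input the attacker supplies.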
Jul 22 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Timon Gehr:

 It is easy to understand and

It's easy to understand because you are very intelligent Timon :-) Bye, bearophile
Jul 22 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 Would you rather write that or:

How much experience do you have writing Haskell programs longer than 50 lines of code? I have great respect for your experience and your intelligence, but be careful with comments like that, or you'll just show your ignorance... If you are a Haskell programmer you usually don't write code like that, so in practice it's not a problem. And even if you sometimes have to write awkward code, Haskell programmers can assure you that on average the advantages of Haskell's purity far outweigh the disadvantages. Overall I prefer D over Haskell, but if you want to criticize Haskell you need much better arguments and way bigger cannons :-) Bye, bearophile
Jul 22 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Timon Gehr:

 What do you think would be the corresponding idiomatic/fast D 
 code? Does it outperform the Haskell code?

Look in RosettaCode for various implementations in D: http://rosettacode.org/wiki/Hamming_numbers#D A good part of the performance difference is caused by the D GC and by the fact that GHC uses the GNU multiprecision library (GMP) for its bigints. In the meantime we have to maintain, optimize and debug our own bigint implementation, wasting time better spent elsewhere. Bye, bearophile
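For reference, the classic O(n) three-pointer merge behind those implementations, sketched here in C with fixed-width integers (hamming is an illustrative name). uint64_t suffices for the first 1691 values; past that, the Haskell and D versions need bigints, which is where GMP versus a hand-rolled BigInt starts to matter.

```c
#include <stdint.h>
#include <stdlib.h>

/* returns the n-th Hamming number (1-based): numbers of form 2^a 3^b 5^c */
uint64_t hamming(int n) {
    uint64_t *h = malloc(n * sizeof *h);
    h[0] = 1;
    int i2 = 0, i3 = 0, i5 = 0;        /* next value to multiply by 2/3/5 */
    for (int k = 1; k < n; k++) {
        uint64_t x2 = 2 * h[i2], x3 = 3 * h[i3], x5 = 5 * h[i5];
        uint64_t m = x2 < x3 ? x2 : x3;
        if (x5 < m) m = x5;
        h[k] = m;
        if (m == x2) i2++;   /* advance every pointer that produced m, */
        if (m == x3) i3++;   /* so duplicates are emitted only once    */
        if (m == x5) i5++;
    }
    uint64_t r = h[n - 1];
    free(h);
    return r;
}
```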
Jul 22 2014
prev sibling next sibling parent "Brian Rogoff" <brogoff gmail.com> writes:
On Tuesday, 22 July 2014 at 21:23:33 UTC, Timon Gehr wrote:
 On 07/22/2014 09:42 PM, Walter Bright wrote:
 The point was not at all to criticize Haskell. The point was 
 that D does
 not need tail recursion because D supports writing loop 
 constructs.

Tail call support is still useful. Looping is not the main reason for supporting tail calls. Eg. a tail call might be indirect and only sometimes recurse on the same function and sometimes call another function.

Indeed, I recall Matthias Felleisen wrote that OOP makes no sense without TCO because methods will indirectly call each other. See https://blogs.oracle.com/jrose/entry/tail_calls_in_the_vm#comment-1259314984000 http://www.eighty-twenty.org/index.cgi/tech/oo-tail-calls-20111001.html While I don't care much for OOP, D is supposed to support that style.
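A minimal C sketch of such an indirect tail call (illustrative; C compilers may or may not eliminate these frames): two functions that bounce to each other in tail position. Without tail-call elimination each bounce costs a stack frame, so is_even(n) needs about n frames; with it, stack use is constant, which is the property Felleisen argues OO method dispatch needs.

```c
static int is_odd(unsigned n);

static int is_even(unsigned n) {
    return n == 0 ? 1 : is_odd(n - 1);   /* tail call to the *other* fn */
}

static int is_odd(unsigned n) {
    return n == 0 ? 0 : is_even(n - 1);  /* tail call back again */
}
```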
Jul 22 2014
prev sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Dmitry Olshansky:

(unlike computed goto, which is very niche and dangerous).

It's a niche, but one that only systems languages are fit for. Even CPython (and other interpreters) uses that feature. Bye, bearophile
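A minimal sketch of that dispatch style, using the GCC/Clang "labels as values" extension (a GNU extension, not ISO C; run and the opcodes are illustrative). This is the technique CPython's bytecode loop uses: each opcode handler jumps straight to the next handler through a table of label addresses, instead of going back through one central switch.

```c
enum { OP_INC, OP_DEC, OP_HALT };

int run(const unsigned char *code) {
    /* &&label takes a label's address -- the "computed goto" extension */
    static void *dispatch[] = { &&op_inc, &&op_dec, &&op_halt };
    int acc = 0;
    const unsigned char *pc = code;
    goto *dispatch[*pc++];           /* indirect jump to first handler */
op_inc:
    acc++;
    goto *dispatch[*pc++];
op_dec:
    acc--;
    goto *dispatch[*pc++];
op_halt:
    return acc;
}
```

The per-opcode indirect jump gives the branch predictor one prediction site per opcode, which is why interpreters prefer it over a single switch.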
Jul 23 2014