digitalmars.D - Inherent code performance advantages of D over C?

reply Walter Bright <newshound2 digitalmars.com> writes:
"there is no way proper C code can be slower than those languages."

   -- 
http://www.reddit.com/r/programming/comments/1s5ze3/benchmarking_d_vs_go_vs_erlang_vs_c_for_mqtt/cduwwoy

comes up now and then. I think it's incorrect, D has many inherent advantages
in generating code over C:

1. D knows when data is immutable. C has to always make worst case assumptions, 
and assume indirectly accessed data mutates.

2. D knows when functions are pure. C has to make worst case assumptions.

3. Function inlining has generally been shown to be of tremendous value in
optimization. D has access to all the source code in the program, or at least
as much as you're willing to show it, and can inline across modules. C cannot
inline functions unless they appear in the same module or in .h files. It's a
rare practice to push many functions into .h files. Of course, there are now
linkers that can do whole program optimization for C, but those are kind of
herculean efforts to work around that C limitation of being able to see only
one module at a time.

4. C strings are 0-terminated, D strings have a length property. The former has 
major negative performance consequences:

     a. lots of strlen()'s are necessary

     b. using substrings usually requires a malloc/copy/free sequence

5. CTFE can push a lot of computation to compile time rather than run time.
This has had spectacular positive performance consequences for things like
regex. C has no CTFE ability.

6. D's array slicing coupled with GC means that many malloc/copy/free's
normally done in C are unnecessary in D.

7. D's "final switch" enables more efficient switch code generation, because
the default doesn't have to be considered.
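
To make points 4 to 7 concrete, here is a minimal sketch in D. The names firstWord, factorial, fact10, Color and colorName are made up for illustration and do not come from any library; whether a given compiler turns each construct into faster code depends on the compiler and its flags.

```d
import std.stdio : writeln;

// Points 4 and 6: a slice carries its own length and shares memory with
// the string it was taken from, so no strlen and no malloc/copy/free.
string firstWord(string s) pure nothrow @safe
{
    foreach (i, c; s)
        if (c == ' ')
            return s[0 .. i];
    return s;
}

// Point 5: an ordinary function that can also run at compile time (CTFE).
ulong factorial(uint n) pure nothrow @safe
{
    ulong result = 1;
    foreach (i; 2 .. n + 1)
        result *= i;
    return result;
}

// Initializing an enum forces compile-time evaluation of the call.
enum ulong fact10 = factorial(10);
static assert(fact10 == 3_628_800);

// Point 7: "final switch" must list every enum member and has no default.
enum Color { red, green, blue }

string colorName(Color c)
{
    final switch (c)
    {
        case Color.red:   return "red";
        case Color.green: return "green";
        case Color.blue:  return "blue";
    }
}

void main()
{
    string line = "hello world";
    string w = firstWord(line);
    assert(w == "hello");
    assert(w.ptr is line.ptr); // same memory as `line`, not a copy
    writeln(w, " ", fact10, " ", colorName(Color.green));
}
```

Leaving a member out of the final switch is a compile error, which is what lets the compiler skip the default path.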
Dec 06 2013
next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Friday, 6 December 2013 at 22:20:19 UTC, Walter Bright wrote:
 "there is no way proper C code can be slower than those 
 languages."

   -- 
 http://www.reddit.com/r/programming/comments/1s5ze3/benchmarking_d_vs_go_vs_erlang_vs_c_for_mqtt/cduwwoy

 comes up now and then. I think it's incorrect, D has many 
 inherent advantages in generating code over C:

 1. D knows when data is immutable. C has to always make worst 
 case assumptions, and assume indirectly accessed data mutates.

 2. D knows when functions are pure. C has to make worst case 
 assumptions.

 3. Function inlining has generally been shown to be of 
 tremendous value in optimization. D has access to all the 
 source code in the program, or at least as much as you're 
 willing to show it, and can inline across modules. C cannot 
 inline functions unless they appear in the same module or in .h 
 files. It's a rare practice to push many functions into .h 
 files. Of course, there are now linkers that can do whole 
 program optimization for C, but those are kind of herculean 
 efforts to work around that C limitation of being able to see 
 only one module at a time.

 4. C strings are 0-terminated, D strings have a length 
 property. The former has major negative performance 
 consequences:

     a. lots of strlen()'s are necessary

     b. using substrings usually requires a malloc/copy/free 
 sequence

 5. CTFE can push a lot of computation to compile time rather 
 than run time. This has had spectacular positive performance 
 consequences for things like regex. C has no CTFE ability.

 6. D's array slicing coupled with GC means that many 
 malloc/copy/free's normally done in C are unnecessary in D.

 7. D's "final switch" enables more efficient switch code 
 generation, because the default doesn't have to be considered.
You can add generic programming.
Dec 06 2013
prev sibling next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 comes up now and then. I think it's incorrect, D has many 
 inherent advantages in generating code over C:
I think in your list you have missed the point 8, that is templates allow for data specialization, or for specialization based on compile-time values.

The common example of the first is the C sort() function compared to the type specialized one.

An example for the second is code for the kD-tree that is specialized on the dimension (coordinate) to slice on:

http://rosettacode.org/wiki/K-d_tree#D

As you see the cyclic selection of the coordinate nextSplit is assigned to an enum:

    struct KdTree(size_t k, F) {
        KdNode!(k, F)* n;
        Orthotope!(k, F) bounds;

        // Constructs a KdTree from a list of points...
        this(Point!(k, F)[] pts, in Orthotope!(k, F) bounds_) pure {
            static KdNode!(k, F)* nk2(size_t split)(Point!(k, F)[] exset) pure {
                ...
                enum nextSplit = (split + 1) % d.length;//cycle coordinates
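
The idea of specializing on a compile-time value can be shown in isolation. The following is only a sketch, not the Rosetta Code implementation: coordinate and sumAxes are hypothetical helpers, and the axis index is passed as a template argument so that it is a constant inside each instantiation.

```d
import std.stdio : writeln;

// Hypothetical helper: read coordinate `split` of a k-dimensional point.
// `split` is a compile-time argument, so the index is a constant here.
double coordinate(size_t split, size_t k)(const double[k] p) pure nothrow @safe
{
    static assert(split < k, "split must be a valid axis");
    return p[split];
}

// Walks the axes cyclically for `depth` steps, adding up the coordinates.
// The next axis is computed at compile time, like the nextSplit enum above.
double sumAxes(size_t split, size_t k)(const double[k] p, size_t depth) pure nothrow @safe
{
    enum nextSplit = (split + 1) % k;
    if (depth == 0)
        return 0;
    return coordinate!(split, k)(p) + sumAxes!(nextSplit, k)(p, depth - 1);
}

void main()
{
    double[3] pt = [1.0, 2.0, 3.0];
    // Axes visited: 0, 1, 2, 0
    assert(sumAxes!(0, 3)(pt, 4) == 7.0);
    writeln(sumAxes!(0, 3)(pt, 4));
}
```

For k = 3 the compiler generates exactly three instantiations of sumAxes, one per axis, each with its index fixed.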
 2. D knows when functions are pure. C has to make worst case 
 assumptions.
Perhaps D purity were designed for usefulness, code correctness, etc. but not to help compilers. I remember some recent discussions in this newsgroup by developers of GDC that explained why the guarantees D offers over C can't lead to true improvements in the generated code. If this is true then perhaps D has some features that weren't designed in hindsight of what back-ends really need to optimize better.

On this whole subject I remember that pointers in Fortran are regarded as so dis-empowered that the Fortran compiler is able to optimize their usage better than any pointers in usual C programs, even C99 programs that use the "restrict" keyword.

There are also situations where D is slower than D: when D can't prove that an array will be accessed in bounds [*]. And when a D compiler because of separate compilation can't de-virtualize a virtual class method call.

Bye,
bearophile

[*] I will have to say more on this topic in few days.
Dec 06 2013
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
 Perhaps D purity were designed for usefulness,
I meant "was".
 There are also situations where D is slower than D:
I meant "than C" :-) Bye, bearophile
Dec 06 2013
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/6/2013 2:40 PM, bearophile wrote:
 I think in your list you have missed the point 8, that is templates allow for
 data specialization, or for specialization based on compile-time values.

 The common example of the first is the C sort() function compared to the type
 specialized one.
That's a good example.
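
For reference, the comparison looks roughly like this when both are called from D. This is a sketch: cmpInt is a made-up comparator, qsort is the C library function as exposed by core.stdc.stdlib, and sort is the Phobos template from std.algorithm.

```d
import core.stdc.stdlib : qsort;
import std.algorithm : sort;
import std.stdio : writeln;

// C-style comparator: qsort reaches it through a function pointer and only
// ever sees untyped void* elements.
extern (C) int cmpInt(const void* a, const void* b)
{
    const x = *cast(const int*) a;
    const y = *cast(const int*) b;
    return x < y ? -1 : (x > y ? 1 : 0);
}

void main()
{
    int[] a = [5, 2, 9, 1];
    int[] b = a.dup;

    // Generic C routine: element size and comparison are run-time arguments.
    qsort(a.ptr, a.length, int.sizeof, &cmpInt);

    // Template instantiated for int[]: the comparison is visible to the
    // compiler and can be inlined.
    sort(b);

    assert(a == [1, 2, 5, 9]);
    assert(a == b);
    writeln(b);
}
```

Both calls produce the same result; the difference is only in how much the optimizer knows about the element type and comparison.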
 2. D knows when functions are pure. C has to make worst case assumptions.
Perhaps D purity were designed for usefulness, code correctness, etc. but not to help compilers. I remember some recent discussions in this newsgroup by developers of GDC that explained why the guarantees D offers over C can't lead to true improvements in the generated code. If this is true then perhaps D has some features that weren't designed in hindsight of what back-ends really need to optimize better.
dmd can and does remove multiple calls to strongly pure functions with the same arguments.
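
A small sketch of what "strongly pure" means here, with made-up functions square and twice. The only parameter is a value type, so the result depends on nothing but the argument; whether the second call is actually removed depends on the compiler and the optimization flags in use.

```d
import std.stdio : writeln;

// Strongly pure: no mutable indirections in the parameters, no global state.
int square(int n) pure nothrow @safe
{
    return n * n;
}

int twice(int n) pure nothrow @safe
{
    // Two calls with the same argument. A compiler is allowed to evaluate
    // square(n) once and reuse the result.
    return square(n) + square(n);
}

void main()
{
    assert(twice(7) == 98);
    writeln(twice(7));
}
```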
 There are also situations where D is slower than D: when D can't prove that an
 array will be accessed in bounds [*].
In the cases where D cannot, can C? Nope. C doesn't even know what an array is. Can any other language? Nope.
 And when a D compiler because of separate compilation can't de-virtualize a
virtual class method call.
Can C devirtualize function calls? Nope.
Dec 06 2013
next sibling parent reply "Maxim Fomin" <maxim maxim-fomin.ru> writes:
On Friday, 6 December 2013 at 22:52:46 UTC, Walter Bright wrote:
 On 12/6/2013 2:40 PM, bearophile wrote:
 I think in your list you have missed the point 8, that is 
 templates allow for
 data specialization, or for specialization based on 
 compile-time values.

 The common example of the first is the C sort() function 
 compared to the type
 specialized one.
That's a good example.
 2. D knows when functions are pure. C has to make worst case 
 assumptions.
Perhaps D purity were designed for usefulness, code correctness, etc. but not to help compilers. I remember some recent discussions in this newsgroup by developers of GDC that explained why the guarantees D offers over C can't lead to true improvements in the generated code. If this is true then perhaps D has some features that weren't designed in hindsight of what back-ends really need to optimize better.
dmd can and does remove multiple calls to strongly pure functions with the same arguments.
and what about holes in immutable, pure and rest type system?
 There are also situations where D is slower than D: when D 
 can't prove that an
 array will be accessed in bounds [*].
In the cases where D cannot, can C? Nope. C doesn't even know what an array is. Can any other language? Nope.
 And when a D compiler because of separate compilation can't 
 de-virtualize a virtual class method call.
Can C devirtualize function calls? Nope.
C doesn't have virtual functions. By the way, does D devirtualize them? AFAIK it doesn't either, but I do remember the spec page was talking about it (this is so D-ish: advertise an optimization trick in the spec and do not implement it).
Dec 06 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/6/2013 3:06 PM, Maxim Fomin wrote:
 and what about holes in immutable, pure and rest type system?
If there are bugs in the type system, then that optimization breaks.
 C doesn't have virtual functions.
Right, but you can (and people do) fake virtual functions with tables of function pointers. No, C doesn't devirtualize those.
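
The pattern meant here is a struct of function pointers that each object points to. A sketch follows, written in D syntax although the pattern is the C one; Shape, ShapeOps, rectArea, triArea and callArea are invented names.

```d
import std.stdio : writeln;

// Hand-rolled "vtable": a table of function pointers.
struct ShapeOps
{
    double function(const Shape*) area;
}

struct Shape
{
    const(ShapeOps)* ops; // which table to use is decided at run time
    double w, h;
}

double rectArea(const Shape* s) { return s.w * s.h; }
double triArea(const Shape* s)  { return 0.5 * s.w * s.h; }

immutable ShapeOps rectOps = ShapeOps(&rectArea);
immutable ShapeOps triOps  = ShapeOps(&triArea);

double callArea(const Shape* s)
{
    // Indirect call through the table; the target is not known statically.
    return s.ops.area(s);
}

void main()
{
    auto r = Shape(&rectOps, 2, 3);
    auto t = Shape(&triOps, 2, 3);
    assert(callArea(&r) == 6);
    assert(callArea(&t) == 3);
    writeln(callArea(&r), " ", callArea(&t));
}
```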
 By the way, does D devirtualize them?
It does for classes/methods marked 'final' and also in cases where it can statically tell that a class instance is the most derived type.
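
A sketch of the two 'final' cases, with made-up classes Base and Leaf. The comments describe what the language guarantees; whether a particular call ends up as a direct call is up to the compiler.

```d
import std.stdio : writeln;

class Base
{
    int id() { return 1; }         // virtual by default in D
    final int tag() { return 42; } // final method: cannot be overridden
}

final class Leaf : Base            // final class: cannot be subclassed
{
    override int id() { return 2; }
}

void main()
{
    Base b = new Leaf;
    Leaf l = new Leaf;

    assert(b.id() == 2);   // static type Base: dispatched dynamically
    assert(l.id() == 2);   // static type is a final class: only one target
    assert(b.tag() == 42); // final method: only one target

    static assert(__traits(isFinalFunction, Base.tag));
    static assert(!__traits(isFinalFunction, Base.id));
    static assert(__traits(isFinalClass, Leaf));
    writeln("ok");
}
```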
Dec 06 2013
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 It does for classes/methods marked 'final' and also in cases 
 where it can statically tell that a class instance is the most 
 derived type.
Recently I have seen this through Reddit (with a comment by Anon):

http://eli.thegreenplace.net/2013/12/05/the-cost-of-dynamic-virtual-calls-vs-static-crtp-dispatch-in-c/

The JavaVM is often able to de-virtualize virtual calls.

Regarding Java performance matters, from my experience another significant source of optimization in the JavaVM that is often overlooked is that the JavaVM is able to partially unroll even loops with a statically-unknown number of cycles. Currently I think GCC/DMD/LDC2 are not able or willing to do this. I think LLVM was trying to work on this problem a little.

Bye,
bearophile
Dec 06 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/6/2013 3:40 PM, bearophile wrote:
 Recently I have seen this through Reddit (with a comment by Anon):

 http://eli.thegreenplace.net/2013/12/05/the-cost-of-dynamic-virtual-calls-vs-static-crtp-dispatch-in-c/

 The JavaVM is often able to de-virtualize virtual calls.
I know. It is an advantage that JITing has. It's also an advantage if you can do whole-program analysis, which can easily be done in Java.
Dec 06 2013
parent reply Marco Leise <Marco.Leise gmx.de> writes:
Am Fri, 06 Dec 2013 15:48:27 -0800
schrieb Walter Bright <newshound2 digitalmars.com>:

 On 12/6/2013 3:40 PM, bearophile wrote:
 Recently I have seen this through Reddit (with a comment by Anon):

 http://eli.thegreenplace.net/2013/12/05/the-cost-of-dynamic-virtual-calls-vs-static-crtp-dispatch-in-c/

 The JavaVM is often able to de-virtualize virtual calls.
I know. It is an advantage that JITing has. It's also an advantage if you can do whole-program analysis, which can easily be done in Java.
How is that easier in Java? When whole-program analysis finds that there is no class extending C, it could devirtualize all methods of C, but(!) you can load and unload new derived classes at runtime, too.

Also the JVM doesn't load all classes at program startup, because it would create too much of a delay. This goes so far that there is even a special class for splash screens with minimal dependencies, to avoid loading most of the runtime and GUI library first.

I think whole-program analysis in such an environment is outright impossible.

-- 
Marco
Dec 07 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/7/2013 1:30 AM, Marco Leise wrote:
 How is that easier in Java? When whole-program analysis finds
 that there is no class extending C, it could devirtualize all
 methods of C, but(!) you can load and unload new derived
 classes at runtime, too.
This can be done by noting what new derived classes are introduced by runtime loading, and re-JITing any functions that devirtualized base classes of it. I don't know if this is actually done, but I don't see an obvious problem with it.
Dec 07 2013
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/07/2013 10:03 AM, Walter Bright wrote:
 On 12/7/2013 1:30 AM, Marco Leise wrote:
 How is that easier in Java? When whole-program analysis finds
 that there is no class extending C, it could devirtualize all
 methods of C, but(!) you can load and unload new derived
 classes at runtime, too.
This can be done by noting what new derived classes are introduced by runtime loading, and re-JITing any functions that devirtualized base classes of it. I don't know if this is actually done, but I don't see an obvious problem with it.
It is actually done, eg. by the HotSpot JVM.
Dec 07 2013
parent Marco Leise <Marco.Leise gmx.de> writes:
Am Sat, 07 Dec 2013 10:34:53 +0100
schrieb Timon Gehr <timon.gehr gmx.ch>:

 On 12/07/2013 10:03 AM, Walter Bright wrote:
 On 12/7/2013 1:30 AM, Marco Leise wrote:
 How is that easier in Java? When whole-program analysis finds
 that there is no class extending C, it could devirtualize all
 methods of C, but(!) you can load and unload new derived
 classes at runtime, too.
This can be done by noting what new derived classes are introduced by runtime loading, and re-JITing any functions that devirtualized base classes of it. I don't know if this is actually done, but I don't see an obvious problem with it.
It is actually done, eg. by the HotSpot JVM.
Nice! I thought that the overhead might be considered excessive in tracking these details for a JIT optimization. -- Marco
Dec 07 2013
prev sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 7 December 2013 at 08:31:08 UTC, Marco Leise wrote:
 How is that easier in Java? When whole-program analysis finds
 that there is no class extending C, it could devirtualize all
 methods of C, but(!) you can load and unload new derived
 classes at runtime, too.

 Also the JVM doesn't load all classes at program startup,
 because it would create too much of a delay. This goes so
 far that there is even a special class for splash screens with
 minimal dependencies, to avoid loading most of the runtime and
 GUI library first.

 I think whole-program analysis in such an environment is
 outright impossible.
You forgot that this is JITted. The JVM can re-emit the code for a function when the assumptions behind its optimization no longer hold. When the JVM loads new classes, it invalidates a bunch of code. That means a function can be final, then virtual, then back to final, etc., during the program's execution.
Dec 07 2013
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Dec 07, 2013 at 12:40:35AM +0100, bearophile wrote:
[...]
 Regarding Java performance matters, from my experience another
 significant source of optimization in the JavaVM that is often
 overlooked is that the JavaVM is able to partially unroll even loops
 with a statically-unknown number of cycles. Currently I think
 GCC/DMD/LDC2 are not able or willing to do this.
[...]

Really? I've seen gcc/gdc unroll loops with unknown number of iterations, esp. when you're using -O3. It just unrolls into something like:

    loop_start:
        if (!loopCondition) goto end;
        loopBody();
        if (!loopCondition) goto end;
        loopBody();
        if (!loopCondition) goto end;
        loopBody();
        if (!loopCondition) goto end;
        loopBody();
        goto loop_start;
    end:    ...

I'm pretty sure I've seen gcc/gdc do this before.

T

-- 
Windows 95 was a joke, and Windows 98 was the punchline.
Dec 06 2013
next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Friday, 6 December 2013 at 23:59:46 UTC, H. S. Teoh wrote:
 On Sat, Dec 07, 2013 at 12:40:35AM +0100, bearophile wrote:
 [...]
 Regarding Java performance matters, from my experience another
 significant source of optimization in the JavaVM that is often
 overlooked is that the JavaVM is able to partially unroll even 
 loops
 with a statically-unknown number of cycles. Currently I think
 GCC/DMD/LDC2 are not able or willing to do this.
[...]

Really? I've seen gcc/gdc unroll loops with unknown number of iterations, esp. when you're using -O3. It just unrolls into something like:

    loop_start:
        if (!loopCondition) goto end;
        loopBody();
        if (!loopCondition) goto end;
        loopBody();
        if (!loopCondition) goto end;
        loopBody();
        if (!loopCondition) goto end;
        loopBody();
        goto loop_start;
    end:    ...

I'm pretty sure I've seen gcc/gdc do this before.

T
LLVM is also able to generate Duff's device on the fly for loops where it makes sense.
Dec 06 2013
prev sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
H. S. Teoh:

 I've seen gcc/gdc unroll loops with unknown number of
 iterations, esp. when you're using -O3. It just unrolls into 
 something
 like:

 	loop_start:
 		if (!loopCondition) goto end;
 		loopBody();
 		if (!loopCondition) goto end;
 		loopBody();
 		if (!loopCondition) goto end;
 		loopBody();
 		if (!loopCondition) goto end;
 		loopBody();
 		goto loop_start;
 	end:	...

 I'm pretty sure I've seen gcc/gdc do this before.
deadalnix:
 LLVM is also able to generate Duff's device on the fly for 
 loops where it make sense.
I have not seen this optimization done on my code (both ldc2 and gcc), but I am glad to be wrong on this.

The OracleVM uses a very different unrolling strategy: it splits the loop in two loops, the first loop has 2, 4 (or sometimes 8 times) unrolling and it doesn't contain tests besides one at the start and end, followed by a second normal (not unrolled) loop of the remaining n % 8 times. I was able to reach the same performance as Java using this strategy manually in D using ldc2.

Bye,
bearophile
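
Done by hand, the two-loop strategy described above might look like the following sketch. It uses a factor of 4 and a made-up function sumUnrolled; it is an illustration of the shape of the transformation, not the code the JVM emits, and no speedup is claimed for it.

```d
import std.stdio : writeln;

long sumUnrolled(const(int)[] a) pure nothrow @safe
{
    long total = 0;
    size_t i = 0;
    // Largest multiple of 4 that fits in the array length.
    const size_t limit = a.length - a.length % 4;

    // Main loop: four elements per iteration, one loop test per iteration.
    for (; i < limit; i += 4)
        total += cast(long) a[i] + a[i + 1] + a[i + 2] + a[i + 3];

    // Remainder loop: the last a.length % 4 elements, not unrolled.
    for (; i < a.length; ++i)
        total += a[i];

    return total;
}

void main()
{
    assert(sumUnrolled([1, 2, 3, 4, 5, 6, 7]) == 28);
    assert(sumUnrolled([]) == 0);
    writeln(sumUnrolled([1, 2, 3, 4, 5, 6, 7]));
}
```

In @safe code the array accesses are still bounds-checked unless the compiler can prove them in range or bounds checking is disabled.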
Dec 06 2013
prev sibling parent Simen Kjærås <simen.kjaras gmail.com> writes:
On 07.12.2013 00:58, H. S. Teoh wrote:
 On Sat, Dec 07, 2013 at 12:40:35AM +0100, bearophile wrote:
 [...]
 Regarding Java performance matters, from my experience another
 significant source of optimization in the JavaVM that is often
 overlooked is that the JavaVM is able to partially unroll even loops
 with a statically-unknown number of cycles. Currently I think
 GCC/DMD/LDC2 are not able or willing to do this.
[...]

Really? I've seen gcc/gdc unroll loops with unknown number of iterations, esp. when you're using -O3. It just unrolls into something like:

    loop_start:
        if (!loopCondition) goto end;
        loopBody();
        if (!loopCondition) goto end;
        loopBody();
        if (!loopCondition) goto end;
        loopBody();
        if (!loopCondition) goto end;
        loopBody();
        goto loop_start;
    end:    ...

I'm pretty sure I've seen gcc/gdc do this before.
The classic way of doing this (I have no idea if it's actually done by any compilers) is Duff's Device[1], which turns this code:

    do {
        *to++ = *from++;
    } while(--count > 0);

into this:

    int n = (count + 7) / 8;
    switch(count % 8) {
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while(--n > 0);
    }

[1]: http://en.wikipedia.org/wiki/Duff's_device

-- 
  Simen
Dec 08 2013
prev sibling parent reply "Maxim Fomin" <maxim maxim-fomin.ru> writes:
On Friday, 6 December 2013 at 23:30:45 UTC, Walter Bright wrote:
 On 12/6/2013 3:06 PM, Maxim Fomin wrote:
 and what about holes in immutable, pure and rest type system?
If there are bugs in the type system, then that optimization breaks.
Bad news: there are many bugs in type system.
 C doesn't have virtual functions.
Right, but you can (and people do) fake virtual functions with tables of function pointers. No, C doesn't devirtualize those.
Neither does D.
 By the way, does D devirtualize them?
It does for classes/methods marked 'final'
This says essentially nothing, because these functions are not virtual. By that reasoning, C 'devirtualizes' all direct calls.
 and also in cases where it can statically tell that a class 
 instance is the most derived type.
I haven't noticed that.
Dec 06 2013
parent Simen Kjærås <simen.kjaras gmail.com> writes:
On 07.12.2013 08:38, Maxim Fomin wrote:
 On Friday, 6 December 2013 at 23:30:45 UTC, Walter Bright wrote:
 On 12/6/2013 3:06 PM, Maxim Fomin wrote:
 and what about holes in immutable, pure and rest type system?
If there are bugs in the type system, then that optimization breaks.
Bad news: there are many bugs in type system.
 C doesn't have virtual functions.
Right, but you can (and people do) fake virtual functions with tables of function pointers. No, C doesn't devirtualize those.
Neither does D.
 By the way, does D devirtualize them?
It does for classes/methods marked 'final'
this is essentially telling nothing, because these functions are not virtual. In your speaking, C 'devirtualizes' all direct calling.
They're both virtual and not (sorta). Consider this case:

    class A {
        int foo() {
            return 3;
        }
    }

    class B : A {
        // final here means no subclass of B can override foo again
        final override int foo() {
            return 4;
        }
    }

    int bar(A a) {
        return a.foo();
    }

    void baz(B b) {
        int n = bar(b);
    }

If the compiler does not inline the call to bar in baz, a virtual call is performed. If it does inline it, then it knows that the function called will always be B.foo, even if the received value may be a subclass of B.

-- 
  Simen
Dec 08 2013
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
On 7 December 2013 08:52, Walter Bright <newshound2 digitalmars.com> wrote:

 On 12/6/2013 2:40 PM, bearophile wrote:

 And when a D compiler because of separate compilation can't de-virtualize
 a virtual class method call.
Can C devirtualize function calls? Nope.
Assuming a comparison to C++, you know perfectly well that D has a severe disadvantage. Unless people micro-manage final (I've never seen anyone do this to date), then classes will have significantly inferior performance to C++. C++ coders don't write virtual on everything. Especially not trivial accessors which must be inlined.
Dec 06 2013
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Manu:

 Assuming a comparison to C++, you know perfectly well that D 
 has a severe
 disadvantage. Unless people micro-manage final (I've never seen 
 anyone do
 this to date), then classes will have significantly inferior 
 performance to C++.
Although D has the two purities (currently they are three), const/immutable, and will hopefully have scope for function arguments, a lot of D programmers will not add those annotations to D code (the D code I see in D.learn usually doesn't have those annotations), so the speed gains of D could be more theoretical than real.

So const/immutable/static/scope/@safe should be the default for a modern language, for efficiency, safety, code understandability and testing.

If you are a new D programmer, and the local variables in your function (including foreach loop variables) are immutable, you learn very quickly to add "mut" or "var" when you want to mutate them. And you will add that annotation only to the variables that you need to mutate. This avoids mutating variables by mistake, and mutating just copies by mistake as in a foreach on an array of structs. So this avoids some bugs.

Bye,
bearophile
Dec 06 2013
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/6/2013 4:40 PM, Manu wrote:
 Assuming a comparison to C++,
This is a comparison to C; a comparison to C++ is something else.
 you know perfectly well that D has a severe
 disadvantage. Unless people micro-manage final (I've never seen anyone do this
 to date), then classes will have significantly inferior performance to C++.
 C++ coders don't write virtual on everything. Especially not trivial accessors
 which must be inlined.
I know well that people used to C++ will likely do this. However, one can get in the habit of by default adding "final:" as the first line in a class definition, and then the compiler will tell you which ones need to be made virtual.
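
A sketch of that habit, using invented classes Account and Savings. Because this sketch relies only on the 'final:' label, the one method meant to be overridden is declared before it.

```d
import std.stdio : writeln;

class Account
{
    private int balance_;

    // Methods meant to be overridden go before the label.
    string kind() { return "basic"; }

final: // every method declared below this label is non-virtual
    int balance() { return balance_; }
    void deposit(int amount) { balance_ += amount; }
}

class Savings : Account
{
    override string kind() { return "savings"; }
    // Overriding balance() here would be rejected by the compiler, which is
    // how one finds out that a method needs to be moved above "final:".
}

void main()
{
    Account a = new Savings;
    a.deposit(10);
    assert(a.kind() == "savings");
    assert(a.balance() == 10);

    static assert(__traits(isFinalFunction, Account.balance));
    static assert(!__traits(isFinalFunction, Account.kind));
    writeln(a.kind(), " ", a.balance());
}
```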
Dec 06 2013
parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 07/12/13 02:10, Walter Bright wrote:
 On 12/6/2013 4:40 PM, Manu wrote:
 you know perfectly well that D has a severe
 disadvantage. Unless people micro-manage final (I've never seen anyone do this
 to date), then classes will have significantly inferior performance to C++.
 C++ coders don't write virtual on everything. Especially not trivial accessors
 which must be inlined.
I know well that people used to C++ will likely do this. However, one can get in the habit of by default adding "final:" as the first line in a class definition, and then the compiler will tell you which ones need to be made virtual.
The disadvantage of this approach is that, if one forgets to add that "final", it doesn't just produce a performance hit -- it means that it may be impossible to correct without breaking downstream code, because users may have overridden class methods that weren't meant to be virtual. By contrast, if you have final-by-default and accidentally leave the "virtual" keyword off a class or method, that can be fixed without hurting anyone.
Dec 07 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/7/2013 1:52 AM, Joseph Rushton Wakeling wrote:
 On 07/12/13 02:10, Walter Bright wrote:
 I know well that people used to C++ will likely do this. However, one can get in
 the habit of by default adding "final:" as the first line in a class definition,
 and then the compiler will tell you which ones need to be made virtual.
The disadvantage of this approach is that, if one forgets to add that "final", it doesn't just produce a performance hit -- it means that it may be impossible to correct without breaking downstream code, because users may have overridden class methods that weren't meant to be virtual.
D doesn't allow overriding non-virtual functions (unlike C++).
Dec 07 2013
next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 07/12/13 18:29, Walter Bright wrote:
 D doesn't allow overriding non-virtual functions (unlike C++).
I'm speaking of the case where what you mean to write is,

    final class Foo
    {
        /* ... lots of methods which are all final, because
         * the class as a whole is final
         */
    }

but what you actually write is,

    class Foo    // Whoops! Forgot the final
    {
        /* ... lots of methods which, because of the missing
         * "final", are instead virtual and can be overridden
         */
    }

In this case you can't fix the original error -- by adding "final" before the class declaration -- without risking breaking downstream code, because someone may have created a subclass of Foo that overrides one or more of its methods.
Dec 07 2013
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 8 December 2013 03:29, Walter Bright <newshound2 digitalmars.com> wrote:

 On 12/7/2013 1:52 AM, Joseph Rushton Wakeling wrote:

 On 07/12/13 02:10, Walter Bright wrote:

 I know well that people used to C++ will likely do this. However, one
 can get in
 the habit of by default adding "final:" as the first line in a class
 definition,
 and then the compiler will tell you which ones need to be made virtual.
The disadvantage of this approach is that, if one forgets to add that "final", it doesn't just produce a performance hit -- it means that it may be impossible to correct without breaking downstream code, because users may have overridden class methods that weren't meant to be virtual.
D doesn't allow overriding non-virtual functions (unlike C++).
But, that's irrelevant, because if they did forget 'final' as suggested, then everything is virtual, so your point has no foundation.

There is overwhelming (almost total) evidence that people barely use final, either due to inexperience/ignorance, indifference, or forgetfulness. I suspect those criteria probably cover close to 100% of the workforce.

It can't be the default state, there are no syntactic safeguards. You've agreed on that in the past. Have you doubled back, or do you still agree?

People who make the sort of _conscious choice_ for their OOP library that it should be the sort of java-like library where everything is overridable, can easily type 'virtual:' at the top, and make their intent explicit. There's no such simplicity with final, because unlike the 'everything is virtual' case, where 'virtual:' is easily applicable and the compiler _will produce an error message_ if they forget, there is no such useful concept 'everything is final', only 'most things are final', which means final must always be micromanaged; 'final:' can't easily be used like 'virtual:' can. And regardless, it remains prone to the risks in my second paragraph.
Dec 07 2013
prev sibling parent Stewart Gordon <smjg_1998 yahoo.com> writes:
On 07/12/2013 17:29, Walter Bright wrote:
 On 12/7/2013 1:52 AM, Joseph Rushton Wakeling wrote:
 On 07/12/13 02:10, Walter Bright wrote:
 I know well that people used to C++ will likely do this. However, one can get in
 the habit of by default adding "final:" as the first line in a class definition,
 and then the compiler will tell you which ones need to be made virtual.
The disadvantage of this approach is that, if one forgets to add that "final", it doesn't just produce a performance hit -- it means that it may be impossible to correct without breaking downstream code, because users may have overridden class methods that weren't meant to be virtual.
D doesn't allow overriding non-virtual functions (unlike C++).
How do you _override_ a non-virtual function in C++, as opposed to define an independent function with the same name in a derived class?

Stewart.

-- 
My email address is valid but not my primary mailbox and not checked regularly. Please keep replies on the 'group where everybody may benefit.
Feb 01 2014
prev sibling next sibling parent "Rob T" <alanb ucora.com> writes:
On Saturday, 7 December 2013 at 00:40:52 UTC, Manu wrote:
 Assuming a comparison to C++, you know perfectly well that D 
 has a severe
 disadvantage. Unless people micro-manage final (I've never seen 
 anyone do
 this to date), then classes will have significantly inferior 
 performance to
 C++.
 C++ coders don't write virtual on everything. Especially not 
 trivial
 accessors which must be inlined.
Yes, but this change will resolve that problem, and I believe it has been approved, correct?

Issue 11616 - Introduce virtual keyword and remove virtual-by-default
https://d.puremagic.com/issues/show_bug.cgi?id=11616

--rt
Dec 06 2013
prev sibling next sibling parent "ponce" <contact gam3sfrommars.fr> writes:
On Saturday, 7 December 2013 at 00:40:52 UTC, Manu wrote:
 Assuming a comparison to C++, you know perfectly well that D 
 has a severe
 disadvantage. Unless people micro-manage final (I've never seen 
 anyone do
 this to date), then classes will have significantly inferior 
 performance to
 C++.
 C++ coders don't write virtual on everything. Especially not 
 trivial
 accessors which must be inlined.
I concur with Manu: if D gains more adoption, we can only expect more calls to be virtual that should not be. Removing unnecessary virtual calls in a C++ codebase gives significant performance improvements in my experience.

But it's not even so much about virtual calls being slower as about the myth that every function being redefinable in a sub-class _by default_ is somehow a good thing. I don't think it is at all.
Dec 07 2013
prev sibling parent "Max Samukha" <maxsamukha gmail.com> writes:
On Saturday, 7 December 2013 at 00:40:52 UTC, Manu wrote:

 Assuming a comparison to C++, you know perfectly well that D 
 has a severe
 disadvantage. Unless people micro-manage final (I've never seen 
 anyone do
 this to date),
I do. Whether a function should be virtual is a design decision that needs to be made from the outset. Having all class functions freely overridable is not a good idea.
 then classes will have significantly inferior performance to
 C++.
 C++ coders don't write virtual on everything. Especially not 
 trivial
 accessors which must be inlined.
Dec 07 2013
prev sibling parent =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 12/06/2013 02:52 PM, Walter Bright wrote:

 On 12/6/2013 2:40 PM, bearophile wrote:
 I think in your list you have missed the point 8, that is templates
 allow for
 data specialization, or for specialization based on compile-time values.

 The common example of the first is the C sort() function compared to
 the type
 specialized one.
That's a good example.
Bjarne Stroustrup has the article "Learning Standard C++ as a New Language" where he demonstrates bearophile's point, as well as how C++ is a better language than C for novices: http://www.stroustrup.com/new_learning.pdf Ali
Dec 06 2013
prev sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
Am 06.12.2013 23:40, schrieb bearophile:
 Walter Bright:

 comes up now and then. I think it's incorrect, D has many inherent
 advantages in generating code over C:
I think in your list you have missed the point 8, that is templates allow for data specialization, or for specialization based on compile-time values.

The common example of the first is the C sort() function compared to the type specialized one.

An example for the second is code for the kD-tree that is specialized on the dimension (coordinate) to slice on:

http://rosettacode.org/wiki/K-d_tree#D

As you see the cyclic selection of the coordinate nextSplit is assigned to an enum:

struct KdTree(size_t k, F) {
    KdNode!(k, F)* n;
    Orthotope!(k, F) bounds;

    // Constructs a KdTree from a list of points...
    this(Point!(k, F)[] pts, in Orthotope!(k, F) bounds_) pure {
        static KdNode!(k, F)* nk2(size_t split)(Point!(k, F)[] exset) pure {
            ...
            enum nextSplit = (split + 1) % d.length;//cycle coordinates
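A minimal sketch of specialization on a compile-time value, in the spirit of the kD-tree excerpt. The names `sqDist` and `nextSplit` here are hypothetical and are not taken from the Rosetta Code program; the point is only that the dimension `k` is a template argument, so each instantiation sees it as a constant.

```d
import std.stdio : writeln;

// Hypothetical helper: squared distance specialized on the dimension k.
double sqDist(size_t k)(in double[k] a, in double[k] b)
{
    double sum = 0;
    // k is a compile-time constant here, so the loop bound is known statically.
    foreach (i; 0 .. k)
    {
        immutable d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Next split axis, computed at compile time (mirrors the enum in the excerpt).
enum nextSplit(size_t split, size_t k) = (split + 1) % k;

void main()
{
    double[3] p = [1, 2, 3];
    double[3] q = [4, 6, 3];
    writeln(sqDist!3(p, q));

    // Evaluated entirely by the compiler; nothing is computed at run time.
    static assert(nextSplit!(0, 3) == 1);
    static assert(nextSplit!(2, 3) == 0); // axis 2 wraps around to 0
}
```

Whether a given back-end actually unrolls or otherwise exploits the constant is up to the optimizer; the language only guarantees that the value is available at compile time.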
 2. D knows when functions are pure. C has to make worst case assumptions.
Perhaps D purity was designed for usefulness, code correctness, etc. but not to help compilers. I remember some recent discussions in this newsgroup by developers of GDC that explained why the guarantees D offers over C can't lead to true improvements in the generated code. If this is true then perhaps D has some features that weren't designed in hindsight of what back-ends really need to optimize better.

On this whole subject I remember that pointers in Fortran are regarded as so dis-empowered that the Fortran compiler is able to optimize their usage better than any pointers in usual C programs, even C99 programs that use the "restrict" keyword.

There are also situations where D is slower than C: when D can't prove that an array will be accessed in bounds [*]. And when a D compiler because of separate compilation can't de-virtualize a virtual class method call.

Bye,
bearophile

[*] I will have to say more on this topic in few days.
That is why most safe systems programming language compilers allow disabling bounds checking. :) Back in the MS-DOS days, I made use of {$R-} sections if I really needed the few ms gained by disabling bounds checking in Turbo Pascal. -- Paulo
Dec 06 2013
parent "bearophile" <bearophileHUGS lycos.com> writes:
Paulo Pinto:

 That is why most safe systems programming language compilers 
 allow disabling bounds checking. :)
Disabling bounds checking (BC) is an admission of defeat (or just of practicality over technical refinement). Various languages approach the situation in different ways, some examples:

- Python has BC, and it can't be disabled.
- Java has BC, but with the large engineering efforts done on the OracleVM, some array accesses are proved to be in-bound, and their checks removed (some BC is removed thanks to inlining). The effect of this optimization is visible for matrix-processing code, etc. (It's not a large effect, but it's useful, and it could be handy to have in D too).
- D has BC, but you can disable it for each compilation unit.
- Ada is like D, but its stronger typing (and the strongly typed style most Ada programs are written in, because it's also a matter of how you use a language) allows the compiler to optimize away some BC with very simple means, without the need for the analysis done by the OracleVM.

Bye,
bearophile
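A small sketch of what bounds checking looks like from the D side. The helper names `at` and `sum` are made up for illustration. The out-of-range access is only caught when checks are compiled in; building with checks switched off entirely (for example dmd's -boundscheck=off) makes that access undefined behaviour, so treat the last part as a demonstration for default builds only.

```d
import core.exception : RangeError;
import std.stdio : writeln;

// Indexing inside a @safe function stays bounds-checked in -release builds;
// only switching checks off entirely (e.g. -boundscheck=off) removes it.
int at(const int[] a, size_t i) @safe
{
    return a[i];
}

// foreach walks the slice directly, so there is no index to check.
int sum(const int[] a) @safe
{
    int total = 0;
    foreach (x; a)
        total += x;
    return total;
}

void main()
{
    auto data = [1, 2, 3, 4];
    assert(at(data, 2) == 3);
    assert(sum(data) == 10);

    bool caught = false;
    try
    {
        at(data, data.length); // one past the end
    }
    catch (RangeError e)
    {
        caught = true; // reached only when bounds checks are compiled in
    }
    writeln("out-of-bounds access caught: ", caught);
}
```

Writing loops with foreach over the slice rather than explicit indices is one simple way to avoid the check without disabling it.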
Dec 06 2013
prev sibling next sibling parent reply "Maxim Fomin" <maxim maxim-fomin.ru> writes:
On Friday, 6 December 2013 at 22:20:19 UTC, Walter Bright wrote:
 "there is no way proper C code can be slower than those 
 languages."

   -- 
 http://www.reddit.com/r/programming/comments/1s5ze3/benchmarking_d_vs_go_vs_erlang_vs_c_for_mqtt/cduwwoy

 comes up now and then. I think it's incorrect, D has many 
 inherent advantages in generating code over C:
What surprises me most is the claim that D can 'hypothetically' generate more efficient code compared with C, especially taking into account the current situation with code generation and optimization.

The claim about inherent advantages implies that code generation is not good now. If it can be so efficient, why is that not the case? And if in practice it is worse, then who cares that 'in theory' code can be better?

I believe that most of your points are either insignificant (like array length - it is carried together with the pointer almost everywhere in C) or provide some marginal advantage. Such advantages are offset by:

- huge runtime library
- constant runtime lib invocation and allocation of stuff on the heap
- horrible mangling (see http://forum.dlang.org/thread/mailman.207.1369611513.13711.digitalmars-d puremagic.com for examples from the hall of D mangling; the mangling is so big that the forum software goes astray)
- phobos snowball - one invocation of some function in the standard library leads to dozens of template instantiations and invocations of pretty much stuff
Dec 06 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/6/2013 3:02 PM, Maxim Fomin wrote:
 What surprises me most is claim that D can 'hypothetically' generate more
 efficient code comparing with C, especially taking into account current
 situation with code generation and optimization.

 The claim about inherent advantages implies that code generation is not good
 now. If it can be so efficient, why it is not the case? And if in practice it is
 worse, than who cares that 'in theory' code can be better?
You can write D code in "C style" and you'll get C results. To get performance advantages from D code, you'll need to write in a structurally different way (as Andrei pointed out). Looking through Phobos, there is a lot of code that is not written to take advantage of D's strengths. An apt one discussed here recently is the std.path.buildPath, which is written in "C style", as in allocating memory for its result. A structural D style version would accept a range for its output, and the range need not allocate memory. This would be fundamentally faster than the typical C approach. This pattern is repeated a lot in Phobos code.
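A rough sketch of the output-range style described above. This is not how std.path.buildPath is implemented; `joinPath` and `FixedSink` are hypothetical names, and the separator is hard-coded to '/' for brevity. The function writes into whatever sink the caller supplies, so the caller decides whether any memory is allocated.

```d
import std.array : appender;
import std.range : isOutputRange, put;
import std.stdio : writeln;

// Hypothetical joinPath: writes the joined path into any output range of
// char instead of allocating and returning a new string.
void joinPath(Sink)(ref Sink sink, const string[] segments)
    if (isOutputRange!(Sink, char))
{
    foreach (i, seg; segments)
    {
        if (i > 0)
            put(sink, '/');
        put(sink, seg);
    }
}

// Hypothetical sink over caller-provided storage: no heap allocation.
struct FixedSink
{
    char[] buf;
    size_t len;

    void put(char c) { buf[len++] = c; }

    void put(const(char)[] s)
    {
        buf[len .. len + s.length] = s[];
        len += s.length;
    }

    const(char)[] data() const { return buf[0 .. len]; }
}

void main()
{
    // Allocating sink: convenient when a fresh string is really wanted.
    auto app = appender!string();
    joinPath(app, ["usr", "local", "bin"]);
    assert(app.data == "usr/local/bin");

    // Stack-backed sink: the same function, but nothing touches the heap.
    char[64] storage;
    auto fixed = FixedSink(storage[]);
    joinPath(fixed, ["a", "b"]);
    assert(fixed.data == "a/b");

    writeln(app.data, " ", fixed.data);
}
```

The same `joinPath` could equally write straight to a file or a socket wrapper, which is where the approach differs structurally from returning a freshly allocated buffer.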
 I believe that most of your points are either insignificant (like array length -
 it is carried together with pointer almost everywhere in C)
I see a lot of C code that does strlen() over and over. I think Tango's XML parser showed what can be done in D versus any known C implementation. It took maximal advantage of D's slicing abilities to avoid copying. Dmitry's regex also showed huge gains over C regex implementations.
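To make the strlen() and slicing point concrete, here is a small sketch (the sample string is arbitrary). A D slice carries its length, and a substring is just another pointer/length pair into the same memory.

```d
import core.stdc.string : strlen;
import std.stdio : writeln;

void main()
{
    string line = "key=value";

    // .length is stored with the slice: O(1), no scan for a terminator.
    assert(line.length == 9);

    // Substrings are slices into the same memory: no malloc/copy/free.
    string key = line[0 .. 3];
    string value = line[4 .. $];
    assert(key == "key" && value == "value");
    assert(value.ptr == line.ptr + 4); // same buffer, just an offset

    // The C view: strlen walks the bytes until it finds the 0 terminator.
    // (D string literals are 0-terminated, so passing .ptr is safe here.)
    assert(strlen(line.ptr) == 9);

    writeln(key, " -> ", value);
}
```

Note that `key` is not 0-terminated at its own end, which is exactly why C code would typically have to copy it before handing it to other C string functions.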
 or provide some marginal advantage.
Even a marginal advantage is a counter example to the claim "there is no way proper C code can be slower than those languages."
 Such advantages are offset by:

 - huge runtime library
C has a huge runtime library, too, it's just that you normally don't notice it because it's not statically linked in. Be that as it may, 2.064 substantially reduced the size of "hello world" programs.
 - constant runtime lib invocation and allocation stuff on heap
This is, as I mentioned, a problem with writing C style code in Phobos.
 - horrible mangling (see
 http://forum.dlang.org/thread/mailman.207.1369611513.13711.digitalmars-d puremagic.com
 examples from hall of D mangling, mangling is so big, that forum software goes
 astray)
Long mangling is not an inherent language characteristic, as that thread suggests improvements.
 - phobos snowball - one invocation of some function in standard library leads to
 dozens template instantiations and invocations of pretty much stuff
True enough, but does that lead to non-performant code? 2.064 cuts that down quite a bit anyway, and I think we can make more improvements in this regard.
Dec 06 2013
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Dec 06, 2013 at 03:19:24PM -0800, Walter Bright wrote:
 On 12/6/2013 3:02 PM, Maxim Fomin wrote:
[...]
Such advantages are offset by:

- huge runtime library
C has a huge runtime library, too, it's just that you normally don't notice it because it's not statically linked in. Be that as it may, 2.064 substantially reduced the size of "hello world" programs.
Are there any upcoming further improvements in this area? It would be nice to not have multi-MB programs that only print "hello world". :) (Not that that's any meaningful indicator of typical program size, since typically programs do more than just print "hello world", but still, I think there's still lots of low-hanging fruit here.) [...]
- phobos snowball - one invocation of some function in standard
library leads to dozens template instantiations and invocations of
pretty much stuff
True enough, but does that lead to non-performant code? 2.064 cuts that down quite a bit anyway, and I think we can make more improvements in this regard.
It would be nice to decouple Phobos modules more. A *lot* more. Currently there is a rather nasty tangle of mutual imports between several large modules (e.g., std.stdio, std.format, std.algorithm, std.conv, and a few others). Import just one of them, and it pulls in *everything*. This goes against the Phobos philosophy as advertised on dlang.org -- that dependencies between modules should be minimal. One low-hanging fruit that comes to mind is to use local imports instead of module-wide imports. If the local imports are inside templated functions, I *think* it would prevent pulling in the imports until the function is actually used, which would have the desired effect. (Right?) Much of Phobos was written before we had this feature, but since we have it now, might as well make good use of it. T -- "A one-question geek test. If you get the joke, you're a geek: Seen on a California license plate on a VW Beetle: 'FEATURE'..." -- Joshua D. Wachs - Natural Intelligence, Inc.
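A minimal sketch of the local-import idea, with a hypothetical helper `sortedCopy`. The module has no top-level imports; each import is scoped to the function that needs it. Whether this actually shrinks the dependency graph or the binary depends on the compiler and linker, so take it as an illustration of the syntax rather than a measured result.

```d
// No module-level imports here.

// Hypothetical helper: because this is a template, its body (including the
// scoped import) is only analysed when some code instantiates it.
auto sortedCopy(T)(const T[] items)
{
    import std.algorithm : sort; // visible only inside this function body

    auto copy = items.dup;
    sort(copy);
    return copy;
}

void main()
{
    import std.stdio : writeln; // scoped to main

    auto result = sortedCopy([3, 1, 2]);
    assert(result == [1, 2, 3]);
    writeln(result);
}
```

A module that only declares `sortedCopy` and never instantiates it would not need std.algorithm to be analysed on its behalf.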
Dec 06 2013
next sibling parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Friday, 6 December 2013 at 23:56:39 UTC, H. S. Teoh wrote:
 It would be nice to decouple Phobos modules more. A *lot* more.
Why? I've seen this point made several times and I can't understand why this is an important concern. I see the interplay between phobos modules as good, it saves reinventing the wheel all over the place, making for a smaller, cleaner standard library. Am I missing something fundamental here?
Dec 06 2013
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Dec 07, 2013 at 01:09:00AM +0100, John Colvin wrote:
 On Friday, 6 December 2013 at 23:56:39 UTC, H. S. Teoh wrote:
It would be nice to decouple Phobos modules more. A *lot* more.
Why? I've seen this point made several times and I can't understand why this is an important concern. I see the interplay between phobos modules as good, it saves reinventing the wheel all over the place, making for a smaller, cleaner standard library. Am I missing something fundamental here?
It's not that it's bad to reuse code. The problem is the dependency is too coarse-grained, so that if you want to, say, print "hello world", it pulls in all sorts of stuff, like algorithms for sorting arrays (just an example, not the actual case), or floating-point format parsers (may actually be the case), which aren't *needed* to perform that particular task. If printing "hello world" requires pulling in file locking code, then by all means, pull that in. But it shouldn't pull in, say, std.complex just because some obscure corner of writeln's implementation makes a reference to std.complex. T -- People tell me I'm stubborn, but I refuse to accept it!
Dec 06 2013
next sibling parent reply "qznc" <qznc web.de> writes:
On Saturday, 7 December 2013 at 00:26:34 UTC, H. S. Teoh wrote:
 On Sat, Dec 07, 2013 at 01:09:00AM +0100, John Colvin wrote:
 On Friday, 6 December 2013 at 23:56:39 UTC, H. S. Teoh wrote:
It would be nice to decouple Phobos modules more. A *lot* 
more.
Why? I've seen this point made several times and I can't understand why this is an important concern. I see the interplay between phobos modules as good, it saves reinventing the wheel all over the place, making for a smaller, cleaner standard library. Am I missing something fundamental here?
It's not that it's bad to reuse code. The problem is the dependency is too coarse-grained, so that if you want to, say, print "hello world", it pulls in all sorts of stuff, like algorithms for sorting arrays (just an example, not the actual case), or floating-point format parsers (may actually be the case), which aren't *needed* to perform that particular task. If printing "hello world" requires pulling in file locking code, then by all means, pull that in. But it shouldn't pull in, say, std.complex just because some obscure corner of writeln's implementation makes a reference to std.complex.
What is the actual problem? Compile times? Binary size? Surely not performance or efficiency. I remember someone from the Go team (maybe Pike) saying that they have deliberate code duplication in the standard library to decouple it. I did not understand the reasoning there, too.
Dec 07 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
07-Dec-2013 12:07, qznc пишет:
 On Saturday, 7 December 2013 at 00:26:34 UTC, H. S. Teoh wrote:
 On Sat, Dec 07, 2013 at 01:09:00AM +0100, John Colvin wrote:
What is the actual problem? Compile times? Binary size? Surely not performance or efficiency.
Binary size and compile times. Since the end result doesn't penalize end-user in any other way it's a win across the board. And you can always import whole packages.
 I remember someone from the Go team (maybe Pike), that they have
 deliberate code duplication in the standard library to decouple it. I
 did not understand the reasoning there, too.
That is just one way to do it. Most of the time "the right thing" is to factor things into smaller reusable parts. Unlike Go we have ways to write very flexible code once and have it run at top speed in all cases. -- Dmitry Olshansky
Dec 07 2013
prev sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Saturday, 7 December 2013 at 00:26:34 UTC, H. S. Teoh wrote:
 On Sat, Dec 07, 2013 at 01:09:00AM +0100, John Colvin wrote:
 On Friday, 6 December 2013 at 23:56:39 UTC, H. S. Teoh wrote:
It would be nice to decouple Phobos modules more. A *lot* 
more.
Why? I've seen this point made several times and I can't understand why this is an important concern. I see the interplay between phobos modules as good, it saves reinventing the wheel all over the place, making for a smaller, cleaner standard library. Am I missing something fundamental here?
It's not that it's bad to reuse code. The problem is the dependency is too coarse-grained, so that if you want to, say, print "hello world", it pulls in all sorts of stuff, like algorithms for sorting arrays (just an example, not the actual case), or floating-point format parsers (may actually be the case), which aren't *needed* to perform that particular task. If printing "hello world" requires pulling in file locking code, then by all means, pull that in. But it shouldn't pull in, say, std.complex just because some obscure corner of writeln's implementation makes a reference to std.complex. T
Ok, so that describes what over-dependency is, but not why it's a problem we should care about.
Dec 07 2013
prev sibling parent reply "Jason den Dulk" <public2 jasondendulk.com> writes:
On Saturday, 7 December 2013 at 00:09:01 UTC, John Colvin wrote:
 On Friday, 6 December 2013 at 23:56:39 UTC, H. S. Teoh wrote:
 It would be nice to decouple Phobos modules more. A *lot* more.
Why? I've seen this point made several times and I can't understand why this is an important concern. I see the interplay between phobos modules as good, it saves reinventing the wheel all over the place, making for a smaller, cleaner standard library. Am I missing something fundamental here?
On the introduction page of the Phobos documentation, as part of its philosophy, it states:

"Classes should strive to be independent of one another. It's discouraging to pull in a megabyte of code bloat by just trying to read a file into an array of bytes. Class independence also means that classes that turn out to be mistakes can be deprecated and redesigned without forcing a rewrite of the rest of the class library."

(This can also apply to functions, templates and modules).

Currently, Phobos does exactly that. It pulls in a lot of bloat to perform trivial tasks, and it is discouraging. More importantly it is difficult to isolate any part of Phobos. When trying to avoid any part of Phobos because of bugginess or inefficiency, I find it next to impossible because chances are, it will be used by some other part of Phobos.

I am speculating here, but I imagine that maintaining and debugging Phobos must be a nightmare. Can anybody speak from experience on this?

One thing I have discovered is that Phobos introduces "junk code" into executables. One time I did an experiment. I copied the bits of Phobos that my program used into a separate file and imported that instead of the Phobos modules. The resultant executable was half the size (using -release, -inline, -O and "strip" in both cases). For some reason, Phobos was adding over 250KB of junk code that strip could not get rid of.

Regards
Jason
Dec 07 2013
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sun, Dec 08, 2013 at 05:19:53AM +0100, Jason den Dulk wrote:
[...]
 I am speculating here, but I imagine that maintaining and debugging
 Phobos must be a nightmare. Can anybody speak from experience on
 this?
Actually, while Phobos does have its warts, it's surprisingly pleasant to read and maintain, thanks to the readability features of D. I've found it to be easy to read, and mostly easy to understand. There are some ugly bits here and there, of course, but compared to, say, glibc, it's extremely readable for a standard library. (And FWIW, I'm only a part-time Phobos volunteer, so I'm saying this not because I've an agenda to defend Phobos, but because I genuinely find it a pleasant surprise compared to most standard libraries of other languages that I've seen.)
 One think I have discovered is that Phobos introduces "junk code"
 into executables. One time I did an experiment. I copied the bits of
 Phobos that my program used into a separate file and imported that
 instead of the Phobos modules. The resultant executable was half the
 size (using -release, -inline, -O and "strip" in both cases). For
 some reason, Phobos was adding over 250KB of junk code that strip
 could not get rid of.
[...] Yeah, this part bothers me too. Once I hacked up a script (well, a little D program :P) that disassembles D executables and builds a reference graph of its symbols. I ran this on a few small test programs, and was quite dismayed to discover that the mere act of importing std.stdio (for calling writeln("Hello World");) will introduce symbols from std.complex into your executable, even though the program has nothing to do with complex numbers. These symbols are never referenced from main() (i.e., the reference graph of the std.complex symbols are disjoint from the subgraph that contains _Dmain), yet they are included in the executable. That's why I said that Phobos has a ways to go in terms of modularity and dependency management. Just because std.complex is used by *some* obscure bit of code in std.stdio, doesn't mean that it should get pulled in just because you want to print Hello World. The compiler could also be a bit smarter about which symbols it emits code for, eliding those that are never actually referenced in the program. While my overall D experience has been quite positive, this is one of the things that I found disappointing. T -- To err is human; to forgive is not our policy. -- Samuel Adler
Dec 07 2013
parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Sunday, 8 December 2013 at 05:27:12 UTC, H. S. Teoh wrote:
 Once I hacked up a script  (well, a little D program :P) that
 disassembles D executables and builds a reference graph of
 its symbols.
Do you still have that somewhere? I've never attempted such a thing and would like to see what it entailed.
Dec 08 2013
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sun, Dec 08, 2013 at 10:56:09AM +0100, John Colvin wrote:
 On Sunday, 8 December 2013 at 05:27:12 UTC, H. S. Teoh wrote:
Once I hacked up a script  (well, a little D program :P) that
disassembles D executables and builds a reference graph of
its symbols.
Do you still have that somewhere? I've never attempted such a thing and would like to see what it entailed.
Well, all it does is to run objdump to disassemble the executable, then parse the output to extract (1) symbols (usually functions) and (2) references to other symbols (within the function body), and construct a graph out of that. It's not 100% accurate, though, because objdump doesn't know anything about vtables or jump tables, so any references through those would not be found. Also, it will obviously produce useless results if your executable has been stripped of symbols. I'll put the code up on github if you're interested. And, looking at the code again, it appears that I was in the middle of a rewrite (doesn't even compile yet), which is in a separate git branch; but master seems usable enough so I'll just push that. T -- Skill without imagination is craftsmanship and gives us many useful objects such as wickerwork picnic baskets. Imagination without skill gives us modern art. -- Tom Stoppard
Dec 09 2013
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sun, Dec 08, 2013 at 10:56:09AM +0100, John Colvin wrote:
 On Sunday, 8 December 2013 at 05:27:12 UTC, H. S. Teoh wrote:
Once I hacked up a script  (well, a little D program :P) that
disassembles D executables and builds a reference graph of
its symbols.
Do you still have that somewhere? I've never attempted such a thing and would like to see what it entailed.
Heh, apparently I already have the code up on github: https://github.com/quickfur/symdep T -- It won't be covered in the book. The source code has to be useful for something, after all. -- Larry Wall
Dec 09 2013
prev sibling next sibling parent Marco Leise <Marco.Leise gmx.de> writes:
Am Sun, 08 Dec 2013 05:19:53 +0100
schrieb "Jason den Dulk" <public2 jasondendulk.com>:

 One think I have discovered is that Phobos introduces "junk code" 
 into executables. One time I did an experiment. I copied the bits 
 of Phobos that my program used into a separate file and imported 
 that instead of the Phobos modules. The resultant executable was 
 half the size (using -release, -inline, -O and "strip" in both 
 cases). For some reason, Phobos was adding over 250KB of junk 
 code that strip could not get rid of.
 
 Regards
 Jason
Strip doesn't remove dead code, but sections in the executable that aren't required for running the program, like symbol names or debugging information. That said, all code is merged into a single .text section by the linker and cannot be tangled by strip at all.

To remove unreferenced functions, use:

  gdc -ffunction-sections -Wl,--gc-sections

That will create a single section for every function (instead of one section per module as far as I understand it) and tell the linker to remove any section that is not referenced.

-- Marco
Dec 08 2013
prev sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 08/12/13 06:25, H. S. Teoh wrote:
 Yeah, this part bothers me too. Once I hacked up a script (well, a
 little D program :P) that disassembles D executables and builds a
 reference graph of its symbols. I ran this on a few small test programs,
 and was quite dismayed to discover that the mere act of importing
 std.stdio (for calling writeln("Hello World");) will introduce symbols
 from std.complex into your executable, even though the program has
 nothing to do with complex numbers. These symbols are never referenced
 from main() (i.e., the reference graph of the std.complex symbols are
 disjoint from the subgraph that contains _Dmain), yet they are included
 in the executable.
Do you have any idea why the std.complex symbols were pulled in, i.e. what dependencies were responsible? The only module that I'm aware of that imports std.complex is std.numeric, which is itself only imported by std.parallelism and std.random. Are you sure it's not just the whole of Phobos being built in statically because you don't strip the binary?
Dec 08 2013
parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Sunday, 8 December 2013 at 10:11:20 UTC, Joseph Rushton 
Wakeling wrote:
 On 08/12/13 06:25, H. S. Teoh wrote:
 Yeah, this part bothers me too. Once I hacked up a script 
 (well, a
 little D program :P) that disassembles D executables and 
 builds a
 reference graph of its symbols. I ran this on a few small test 
 programs,
 and was quite dismayed to discover that the mere act of 
 importing
 std.stdio (for calling writeln("Hello World");) will introduce 
 symbols
 from std.complex into your executable, even though the program 
 has
 nothing to do with complex numbers. These symbols are never 
 referenced
 from main() (i.e., the reference graph of the std.complex 
 symbols are
 disjoint from the subgraph that contains _Dmain), yet they are 
 included
 in the executable.
Do you have any idea why the std.complex symbols were pulled in, i.e. what dependencies were responsible? The only module that I'm aware of that imports std.complex is std.numeric, which is itself only imported by std.parallelism and std.random. Are you sure it's not just the whole of Phobos being built in statically because you don't strip the binary?
std.stdio -> std.algorithm -> std.random -> std.numeric -> std.complex.
Dec 08 2013
next sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 08/12/13 11:24, John Colvin wrote:
 std.stdio -> std.algorithm -> std.random -> std.numeric -> std.complex.
I'd forgotten that std.algorithm pulled in std.random. Glancing through, I'm not sure it uses it apart from for unittests? So it might be possible to strip out the dependency ... I'll have a look this afternoon. This could be a useful lint tool to have, checking for imports that are only used by unittest blocks.
Dec 08 2013
parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Sunday, 8 December 2013 at 10:31:49 UTC, Joseph Rushton 
Wakeling wrote:
 On 08/12/13 11:24, John Colvin wrote:
 std.stdio -> std.algorithm -> std.random -> std.numeric -> 
 std.complex.
I'd forgotten that std.algorithm pulled in std.random. Glancing through, I'm not sure it uses it apart from for unittests? So it might be possible to strip out the dependency ... I'll have a look this afternoon. This could be a useful lint tool to have, checking for imports that are only used by unittest blocks.
This was just from a quick grepping session. I'm sure there are other paths from std.stdio to std.complex. You should run DGraph on it :p
Dec 08 2013
parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 08/12/13 11:34, John Colvin wrote:
 This was just from a quick grepping session. I'm sure there are other paths from
 std.stdio to std.complex. You should run DGraph on it :p
Nice thought, must get round to it :-)
Dec 08 2013
prev sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 08/12/13 11:31, Joseph Rushton Wakeling wrote:
 I'd forgotten that std.algorithm pulled in std.random.  Glancing through, I'm
 not sure it uses it apart from for unittests?
On closer look, it's used for std.algorithm.topN. I guess it could be relegated to being imported inside that function (and appropriate unittest blocks), but that does justify it being a top-level import.
Dec 08 2013
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
07-Dec-2013 03:55, H. S. Teoh пишет:
 On Fri, Dec 06, 2013 at 03:19:24PM -0800, Walter Bright wrote:
 On 12/6/2013 3:02 PM, Maxim Fomin wrote:
 - phobos snowball - one invocation of some function in standard
 library leads to dozens template instantiations and invocations of
 pretty much stuff
 One low-hanging fruit that comes to mind is to use local imports instead
 of module-wide imports. If the local imports are inside templated
 functions, I *think* it would prevent pulling in the imports until the
 function is actually used, which would have the desired effect. (Right?)
 Much of Phobos was written before we had this feature, but since we have
 it now, might as well make good use of it.
A major point is to decouple feather-weight "traits" part of modules and the API part of module (preferably also split by category). Then a given Phobos module may do something like this:

import std.regex.traits;

auto dirEntries(C, RegEx)(in C[] path, RegEx re)
     if(isSomeChar!C && isRegexFor!(Regex, C))
{
     import std.regex; //full package
     ...
}

-- Dmitry Olshansky
Dec 06 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
07-Dec-2013 11:15, Dmitry Olshansky пишет:
 07-Dec-2013 03:55, H. S. Teoh пишет:
 On Fri, Dec 06, 2013 at 03:19:24PM -0800, Walter Bright wrote:
 On 12/6/2013 3:02 PM, Maxim Fomin wrote:
[snip]
 import std.regex.traits;

 auto dirEntries(C, RegEx)(in C[] path, RegEx re)
      if(isSomeChar!C && isRegexFor!(Regex, C))
s/Regex/RegEx/
 {
      import std.regex; //full package
      ...
 }
-- Dmitry Olshansky
Dec 06 2013
prev sibling parent reply "Maxim Fomin" <maxim maxim-fomin.ru> writes:
On Friday, 6 December 2013 at 23:19:22 UTC, Walter Bright wrote:
 You can write D code in "C style" and you'll get C results. To 
 get performance advantages from D code, you'll need to write in 
 a structurally different way (as Andrei pointed out).

 Looking through Phobos, there is a lot of code that is not 
 written to take advantage of D's strengths. An apt one 
 discussed here recently is the std.path.buildPath, which is 
 written in "C style", as in allocating memory for its result.

 A structural D style version would accept a range for its 
 output, and the range need not allocate memory. This would be 
 fundamentally faster than the typical C approach.

 This pattern is repeated a lot in Phobos code.


 I believe that most of your points are either insignificant 
 (like array length -
 it is carried together with pointer almost everywhere in C)
I see a lot of C code that does strlen() over and over. I think Tango's XML parser showed what can be done in D versus any known C implementation. It took maximal advantage of D's slicing abilities to avoid copying. Dmitry's regex also showed huge gains over C regex implementations.
This C code is easy to fix. Unlike in D, where there is no way to fix constant gc allocations, and if gc is disabled you say goodbye to: classes, interfaces, exceptions, dynamic arrays, delegates, lambdas, associative arrays, etc. By the way, if you mentioned strlen(), let's compare printf() and writeln().
 or provide some marginal advantage.
Even a marginal advantage is a counter example to the claim "there is no way proper C code can be slower than those languages."
But summing these issues together means D code cannot compete with C code.
 Such advantages are offset by:

 - huge runtime library
C has a huge runtime library, too, it's just that you normally don't notice it because it's not statically linked in. Be that as it may, 2.064 substantially reduced the size of "hello world" programs.
 - constant runtime lib invocation and allocation stuff on heap
This is, as I mentioned, a problem with writing C style code in Phobos.
Is this C style? T[] data = [T, T, T]; Or this: T data; auto dg = { return data; }
 - horrible mangling (see
 http://forum.dlang.org/thread/mailman.207.1369611513.13711.digitalmars-d puremagic.com
 examples from hall of D mangling, mangling is so big, that 
 forum software goes
 astray)
Long mangling is not an inherent language characteristic, as that thread suggests improvements.
But this is a flaw in the implementation. A language and its advantages are dead without an implementation.

And again, notice that you are speaking about 'hypothetical advantages' (language advantages), which implies two things:
1) current efficiency is worse when compared with some benchmark;
2) despite many years of development, the community has failed to realize these advantages.

This makes me think that there is probably another reason why the code is less efficient, for example that fundamental characteristics of the language make it hard to be quick. This is not bad per se, but saying that D code can be faster than C, given so many problems with D, looks like advertisement rather than technical comparison.
Dec 06 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/6/2013 11:34 PM, Maxim Fomin wrote:
 On Friday, 6 December 2013 at 23:19:22 UTC, Walter Bright wrote:
 I see a lot of C code that does strlen() over and over. I think Tango's XML
 parser showed what can be done in D versus any known C implementation. It took
 maximal advantage of D's slicing abilities to avoid copying.

 Dmitry's regex also showed huge gains over C regex implementations.
This C code is easy to fix.
No it isn't. Have you tried? I have, it's very hard to retrofit a non-trivial C program with carrying around all the string lengths as a separate value. Call just about any C library, and you're back again to strlen(). It's even harder to get away from strlen() in C++ because it inextricably tied std::string to it. A lot of capable people worked for decades on C regexen, yet Dmitry blew them away with his D version.
 Unlike in D there is no way to fix constant gc
 allocations and if gc is disabled, you say goodbye to: classes, interfaces,
 exceptions, dynamic arrays, delegates, lambdas, AA arrays, etc.
I think you're way exaggerating. (Note that C has none of those features.)
 By the way, since you mentioned strlen(), let's compare printf() and writeln().
Sure. Feel free.
 But summing these issues together means D code cannot compete with C code.
This is simply not true. You might want to look at Don's presentation at DConf 2013 where he explains that Sociomantic uses D not for romantic reasons but because the performance of it gives a competitive edge.
 examples from hall of D mangling, mangling is so big, that forum software goes
 astray)
Long mangling is not an inherent language characteristic, as that thread suggests improvements.
But this is flaw in implementation. Language and its advantages are dead without implementation.
I've worked with a lot of C compilers that had lousy, buggy implementations and lousy code generation. I'm being careful here to deal with characteristics of the language, not the implementation.
 And again, notice that you are speaking about 'hypothetical advantages'
 (language advantages) which implies two things:
 1) current efficiency is worse when comparing with some benchmark
 2) despite many years of development, community failed to realize these
 advantages.

 This makes me think that probably there is another reason of why code is less
 efficient, for example fundamental characteristics of the language make it
 hard to be quick. This is not bad per se, but saying that language code can be
 faster than C, taking into account so many problems with D, looks like
 advertisement, rather than technical comparison.
There are several D projects which show faster runs than C. If your goal is to pragmatically write faster D code than in C, you can do it without too much effort. If your goal is to find problem(s) with D, you can certainly do that, too.
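
As a rough illustration of the strlen()/slicing point discussed above (a minimal sketch, not code from any of the projects mentioned): a D string carries its length, so a substring can be returned as a slice of the original buffer with no scan for a terminator, no allocation, and no copy. The helper name baseNameSlice is hypothetical and exists only for this example; Phobos has std.path.baseName for real use.

```d
import std.stdio : writeln;

/// Hypothetical helper: returns the part of `path` after the last '/'
/// as a slice of the original string. Nothing is allocated or copied.
string baseNameSlice(string path) pure nothrow @nogc @safe
{
    // The array already knows its length, so we can walk backwards
    // from the end without first scanning for a 0 terminator.
    foreach_reverse (i, c; path)
    {
        if (c == '/')
            return path[i + 1 .. $];
    }
    return path; // no separator: the whole string is the name
}

void main()
{
    string path = "/usr/local/include/d/std/path.d";
    string name = baseNameSlice(path);

    writeln(name);

    // The slice points into the original buffer rather than a copy.
    assert(name == "path.d");
    assert(name.ptr == path.ptr + (path.length - name.length));
}
```

The equivalent C routine typically needs strlen() to find the end, and the caller needs a malloc/copy/free sequence if the substring does not happen to end where the original string ends.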
Dec 07 2013
parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 07/12/13 09:14, Walter Bright wrote:
 There are several D projects which show faster runs than C. If your goal is to
 pragmatically write faster D code than in C, you can do it without too much
 effort. If your goal is to find problem(s) with D, you can certainly do that,
 too.
Well, as the author of a D library which outperforms the C library that inspired it (at least within the limits of its much smaller range of functionality; it's been a bit neglected of late and needs more input) ...

... the practical experience I've had is that more than an outright performance comparison, what it often comes down to is effort vs. results, and the cleanliness/maintainability of the resulting code. This is particularly true when it comes to C code that is designed to be "safe", with all the resulting boilerplate. It's typically possible to match or exceed the performance of a C program with much more concise and easy to follow D code.

Another factor that's important here is that C and D in general seem to lead to different design solutions. Even if one has an exact example in C to compare to, the natural thing to do in D is often something different, and that leads to subtle and not-so-subtle implementation differences that in turn affect performance.

Example: in the C library that was my inspiration, there's a function which requires the user to pass a buffer, to which it writes a certain set of values which are calculated from the underlying data. I didn't much like the idea of compelling the user to pass a buffer, so when I wrote my D equivalent I used stuff from std.range and std.algorithm to make the function return a lazily-evaluated range that would offer the same values as the C code stored in the buffer array.

I assumed this might lead to a small overall performance hit because the C program could just write once to a buffer and re-use the buffer, whereas I might be lazily calculating and re-calculating. Unfortunately it turned out that for whatever reason, my lazily-calculated range was somehow responsible for lots of micro allocations, which slowed things down a lot. (I tried it out again earlier this morning, just to refresh my memory, and it looks like this may no longer be the case; so perhaps something has been fixed here...)

So, that in turn led me to another solution again, where instead of an external buffer being passed in, I created an internal cache which could be written to once and re-used again and again and again, never needing to recalculate unless the internal data was changed.

Now, _that_ turned out to be significantly faster than the C program, which was almost certainly doing unnecessary recalculation of the buffer -- because it recalculated every time the function was called, whereas my program could rely on the cache, calculate once, and after that just return the slice of calculated values.

On the other hand, if I tweaked the internals of the function so that every call _always_ involved recalculating and rewriting to the cache, it was slightly slower than the C -- probably because now it was the C code that was doing less recalculation, because code that was calling the function was calling it once and then using the buffer, rather than calling it multiple times.

TL;DR the point is that writing in D gave me the opportunity to spend mental and programming time exploring these different choices and focusing on algorithms and data structures, rather than all the effort and extra LOC required to get a _particular_ idea running in C. That's where the real edge arises.
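
The internal-cache design described above might look roughly like the following. This is only a sketch under assumptions, not the library's actual code: the Samples type, its put method, and the running-sum calculation are hypothetical stand-ins for "a set of values calculated from the underlying data".

```d
import std.stdio : writeln;

/// Hypothetical container that caches values derived from its data
/// and recalculates them only after the data has changed.
struct Samples
{
    private double[] data;
    private double[] cache;   // reused between calls
    private bool dirty = true;
    private size_t recalcs;   // counts recalculations, for demonstration

    /// Adds a value and marks the cache as stale.
    void put(double x)
    {
        data ~= x;
        dirty = true;
    }

    /// Running sums of the data, returned as a read-only slice of the
    /// internal cache. Repeated calls do no work unless data changed.
    const(double)[] runningSums()
    {
        if (dirty)
        {
            cache.length = data.length;
            double total = 0;
            foreach (i, x; data)
            {
                total += x;
                cache[i] = total;
            }
            dirty = false;
            ++recalcs;
        }
        return cache;
    }
}

void main()
{
    Samples s;
    s.put(1);
    s.put(2);
    s.put(3);

    auto a = s.runningSums();
    auto b = s.runningSums();   // served from the cache

    assert(a == [1.0, 3.0, 6.0]);
    assert(a.ptr == b.ptr);     // same memory, nothing copied
    assert(s.recalcs == 1);

    s.put(4);                   // invalidates the cache
    assert(s.runningSums() == [1.0, 3.0, 6.0, 10.0]);
    assert(s.recalcs == 2);

    writeln(s.runningSums());
}
```

One design caveat of this sketch: a slice handed out earlier is not updated when the data changes, so callers should ask for a fresh slice after mutating the container.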
Dec 07 2013
next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Saturday, 7 December 2013 at 09:46:11 UTC, Joseph Rushton 
Wakeling wrote:
This is exactly how I see it too. Well said.
Dec 07 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/7/2013 1:45 AM, Joseph Rushton Wakeling wrote:
 TL;DR the point is that writing in D gave me the opportunity to spend mental
 and programming time exploring these different choices and focusing on
 algorithms and data structures, rather than all the effort and extra LOC
 required to get a _particular_ idea running in C.  That's where the real edge
 arises.
I've noticed this too, but I've found it hard to explain to C programmers. To say it a different way, D makes it easy to refactor code to rapidly try out different algorithms. In C, one tends to stick with the original design because it is so much harder to refactor.
Dec 07 2013
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Dec 07, 2013 at 09:33:53AM -0800, Walter Bright wrote:
 On 12/7/2013 1:45 AM, Joseph Rushton Wakeling wrote:
TL;DR the point is that writing in D gave me the opportunity to spend
mental and programming time exploring these different choices and
focusing on algorithms and data structures, rather than all the
effort and extra LOC required to get a _particular_ idea running in
C.  That's where the real edge arises.
I've noticed this too, but I've found it hard to explain to C programmers. To say it a different way, D makes it easy to refactor code to rapidly try out different algorithms. In C, one tends to stick with the original design because it is so much harder to refactor.
D's metaprogramming capabilities and compile-time introspection help this a lot. Even little things like @property (in spite of the problems associated with its current implementation) go a long way in allowing two divergent implementations to share the same API, so that it's much easier to replace one implementation with another -- with no runtime cost.

In C, once you've committed to a particular implementation, you can't easily change it without rewriting large chunks of code. While it *is* possible to do this to some extent using void* and function ptrs and so on, that would introduce runtime overhead, whereas in D, it's only a compile-time difference. To avoid runtime overhead you'd have to make heavy use of macros, which quickly becomes a maintenance nightmare.

T

--
Never step over a puddle, always step around it. Chances are that whatever made it is still dripping.
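
To make the "same API, different implementation, no runtime cost" idea concrete, here is a minimal sketch. All names in it (StoredLength, ComputedLength, doubled) are hypothetical and chosen only for illustration. One type exposes length as a plain field, the other as a @property function; the template is instantiated separately for each, so no function pointers or void* are involved.

```d
import std.stdio : writeln;

/// Implementation A: the value is stored as a plain field.
struct StoredLength
{
    double length;
}

/// Implementation B: the value is computed on demand behind @property,
/// but callers still write `v.length`.
struct ComputedLength
{
    double x, y;

    @property double length() const
    {
        import std.math : sqrt;
        return sqrt(x * x + y * y);
    }
}

/// Works with any type whose `.length` converts to double. The check
/// happens at compile time; each instantiation is ordinary direct code.
double doubled(T)(T v)
    if (is(typeof(v.length) : double))
{
    return 2 * v.length;
}

void main()
{
    assert(doubled(StoredLength(2.5)) == 5.0);
    assert(doubled(ComputedLength(3, 4)) == 10.0);

    writeln(doubled(ComputedLength(3, 4)));
}
```

Swapping StoredLength for ComputedLength requires no change to doubled or its callers, which is the kind of cheap experimentation described above.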
Dec 07 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 07/12/13 20:00, H. S. Teoh wrote:
 In C, once you've committed to a particular implementation, you can't
 easily change it without rewriting large chunks of code. While it *is*
 possible to do this to some extent using void* and function ptrs and so
 on, that would introduce runtime overhead, whereas in D, it's only a
 compile-time difference. To avoid runtime overhead you'd have to make
 heavy use of macros, which quickly becomes a maintenance nightmare.
When I was writing a lot of C code, I used that approach a great deal -- I learned it from the internals of the GNU Scientific Library and found it very productive and useful for a lot of what I was working on. Ironically, when I later came into touch with C++'s generics I found them very strange and found it difficult to see the sense in them, because I was so used to having and making use of _runtime_ polymorphism in C.
Dec 07 2013
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
On 8 December 2013 03:33, Walter Bright <newshound2 digitalmars.com> wrote:

 On 12/7/2013 1:45 AM, Joseph Rushton Wakeling wrote:

 TL;DR the point is that writing in D gave me the opportunity to spend
 mental and
 programming time exploring these different choices and focusing on
 algorithms
 and data structures, rather than all the effort and extra LOC required to
 get a
 _particular_ idea running in C.  That's where the real edge arises.
I've noticed this too, but I've found it hard to explain to C programmers. To say it a different way, D makes it easy to refactor code to rapidly try out different algorithms. In C, one tends to stick with the original design because it is so much harder to refactor.
True as compared to C, but I wouldn't say this is true in general. tools yet. with great anticipation. Maybe when the D front-end is a library, and tooling has such powerful (and reliable) semantic analysis as the compiler does it may be possible?
Dec 07 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/7/2013 4:46 PM, Manu wrote:
 True as compared to C, but I wouldn't say this is true in general.

tools
 yet.

 great anticipation.
 Maybe when the D front-end is a library, and tooling has such powerful (and
 reliable) semantic analysis as the compiler does it may be possible?
Needing a tool to refactor code is a bit of a mystery to me. I've never used one, and never felt that not having one inhibited me from refactoring.
Dec 07 2013
next sibling parent reply "Luís Marques" <luis luismarques.eu> writes:
On Sunday, 8 December 2013 at 01:34:34 UTC, Walter Bright wrote:
 Needing a tool to refactor code is a bit of a mystery to me. 
 I've never used one, and never felt that not having one 
 inhibited me from refactoring.
Do you include renaming a variable as refactoring? I'm sure you can appreciate that having a tool automate that process reduces errors, and is faster (a generic text replace function is dangerous).
Dec 07 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/7/2013 5:49 PM, "Luís Marques" <luis luismarques.eu> wrote:
 Do you include renaming a variable as refactoring? I'm sure you can appreciate
 that having a tool automate that process reduces errors, and is faster (a
 generic text replace function is dangerous).
I've done global renaming, and it just hasn't been much of an issue. I suppose if I did it a lot a tool would help, but I don't do it often enough to matter. I also usually run grep first to make sure I don't bork things up completely :-) But when I talk about refactoring, I mean things like changing data structures and algorithms. Renaming things is pretty far over on the trivial end, and isn't going to help your program run any faster.
Dec 07 2013
parent reply "Luís Marques" <luis luismarques.eu> writes:
On Sunday, 8 December 2013 at 01:59:15 UTC, Walter Bright wrote:
 But when I talk about refactoring, I mean things like changing 
 data structures and algorithms. Renaming things is pretty far 
 over on the trivial end, and isn't going to help your program 
 run any faster.
Well, I was just so surprised by your answer that I was looking for common ground :-)
Dec 07 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sun, Dec 08, 2013 at 03:01:03AM +0100, digitalmars-d-bounces puremagic.com
wrote:
 On Sunday, 8 December 2013 at 01:59:15 UTC, Walter Bright wrote:
But when I talk about refactoring, I mean things like changing
data structures and algorithms. Renaming things is pretty far over
on the trivial end, and isn't going to help your program run any
faster.
Well, I was just so surprised by your answer that I was looking for common ground :-)
OTOH, I was quite confused the first time somebody talked about "refactoring" to refer to variable renaming. To me, refactoring means reorganizing your code, like factoring out common code into separate functions, and moving stuff around modules, and substituting algorithms; the kind of major code surgery where you go through every line (or every block) and re-stitch things together in a new (and hopefully cleaner) way. Variable renaming sounds almost like a joke to me in comparison. I was quite taken aback that people would think "variable renaming" when they say "refactoring", to be quite honest.

Or maybe this is just another one of those cultural old age indicators? Has the term "refactoring" shifted to mean "variable renaming" among the younger coders these days? Genuine question. I'm baffled that these two things could even remotely be considered similar things.

T

--
Don't throw out the baby with the bathwater. Use your hands...
Dec 07 2013
next sibling parent Marco Leise <Marco.Leise gmx.de> writes:
On Sat, 7 Dec 2013 21:35:39 -0800,
"H. S. Teoh" <hsteoh quickfur.ath.cx> wrote:

 On Sun, Dec 08, 2013 at 03:01:03AM +0100, digitalmars-d-bounces puremagic.com
wrote:
 On Sunday, 8 December 2013 at 01:59:15 UTC, Walter Bright wrote:
But when I talk about refactoring, I mean things like changing
data structures and algorithms. Renaming things is pretty far over
on the trivial end, and isn't going to help your program run any
faster.
Well, I was just so surprised by your answer that I was looking for common ground :-)
OTOH, I was quite confused the first time somebody talked about "refactoring" to refer to variable renaming. To me, refactoring means reorganizing your code, like factoring out common code into separate functions, and moving stuff around modules, and substituting algorithms; the kind of major code surgery where you go through every line (or every block) and re-stitch things together in a new (and hopefully cleaner) way. Variable renaming sounds almost like a joke to me in comparison. I was quite taken aback that people would think "variable renaming" when they say "refactoring", to be quite honest. Or maybe this is just another one of those cultural old age indicators? Has the term "refactoring" shifted to mean "variable renaming" among the younger coders these days? Genuine question. I'm baffled that these two things could even remotely be considered similar things. T
IDEs offer symbol renaming in their catalog of automated refactorings. Often a single key press (like F2) makes safely renaming a symbol to a more descriptive name so easy that "refactoring" often just refers to that menu of automated refactorings, and most probably to the rename command :)

--
Marco
Dec 08 2013
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/08/2013 06:35 AM, H. S. Teoh wrote:
 ...

 Or maybe this is just another one of those cultural old age indicators?
 Has the term "refactoring" shifted to mean "variable renaming" among the
 younger coders these days? Genuine question. I'm baffled that these two
 things could even remotely be considered similar things.


 T
"Refactoring" means changing code without changing its functional behaviour.
Dec 08 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sun, Dec 08, 2013 at 10:30:37AM +0100, Timon Gehr wrote:
 On 12/08/2013 06:35 AM, H. S. Teoh wrote:
...

Or maybe this is just another one of those cultural old age
indicators?  Has the term "refactoring" shifted to mean "variable
renaming" among the younger coders these days? Genuine question. I'm
baffled that these two things could even remotely be considered
similar things.


T
"Refactoring" means changing code without changing its functional behaviour.
OK, but still, renaming must be the most trivial of them all?

T

--
The easy way is the wrong way, and the hard way is the stupid way. Pick one.
Dec 08 2013
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 12/08/2013 04:37 PM, H. S. Teoh wrote:
 On Sun, Dec 08, 2013 at 10:30:37AM +0100, Timon Gehr wrote:
 On 12/08/2013 06:35 AM, H. S. Teoh wrote:
 ...

 Or maybe this is just another one of those cultural old age
 indicators?  Has the term "refactoring" shifted to mean "variable
 renaming" among the younger coders these days? Genuine question. I'm
 baffled that these two things could even remotely be considered
 similar things.


 T
"Refactoring" means changing code without changing its functional behaviour.
OK, but still, renaming must be the most trivial of them all? T
I don't think so. E.g. stripping comments and formatting is more trivial. :o) Also, it is not that easy: an entire D front end and some additional static analysis are probably required to automatically rename symbols in a reliable way.
Dec 08 2013
prev sibling next sibling parent "Luís Marques" <luis luismarques.eu> writes:
On Sunday, 8 December 2013 at 05:37:09 UTC, H. S. Teoh wrote:
 Or maybe this is just another one of those cultural old age 
 indicators?
 Has the term "refactoring" shifted to mean "variable renaming" 
 among the
 younger coders these days? Genuine question. I'm baffled that 
 these two
 things could even remotely be considered similar things.
Yes, renaming is not very representative of "refactoring", which generally is intended to mean more complex reorganizations. (That's why I asked Walter if he considered renaming something to be refactoring, before arguing my point). Still, it seems reasonable to me to include renaming as part of the refactoring tools, conceptually and practically. I chose the trivial task of renaming for the example because it was something everybody does, is easily understandable and adequate to the argument: a human and a generic editor function/macro do not do it as quickly and safely as a specialized refactoring tool which understands the language (which will fix all static uses of the old name, and can point out other places for review).
Dec 09 2013
prev sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 8 December 2013 at 05:37:09 UTC, H. S. Teoh wrote:
 Or maybe this is just another one of those cultural old age 
 indicators?
 Has the term "refactoring" shifted to mean "variable renaming" 
 among the
 younger coders these days? Genuine question. I'm baffled that 
 these two
 things could even remotely be considered similar things.
Refactoring is changing your code, without changing what it does. It is done in order to make your code easier to understand, more maintainable, more testable, or whatever. Both extracting some code into a function and renaming a variable are refactoring operations.
Dec 09 2013
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Dec 10, 2013 at 02:42:05AM +0100, deadalnix wrote:
 On Sunday, 8 December 2013 at 05:37:09 UTC, H. S. Teoh wrote:
Or maybe this is just another one of those cultural old age
indicators?  Has the term "refactoring" shifted to mean "variable
renaming" among the younger coders these days? Genuine question. I'm
baffled that these two things could even remotely be considered
similar things.
Refactoring is changing your code, without changing what it does. It is done in order to make your code easier to understand, more maintainable, more testable, or whatever. Both extracting some code into a function and renaming a variable are refactoring operations.
OK, but isn't renaming among the more trivial refactoring operations? Albeit, granted, it is probably also the most easily automated.

Splitting an overly-long function into smaller pieces is a more meaningful (IMO) refactoring operation, but probably much more difficult to automate (if it even can be).

T

--
Nearly all men can stand adversity, but if you want to test a man's character, give him power. -- Abraham Lincoln
Dec 09 2013
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
On 10 December 2013 11:42, deadalnix <deadalnix gmail.com> wrote:

 On Sunday, 8 December 2013 at 05:37:09 UTC, H. S. Teoh wrote:

 Or maybe this is just another one of those cultural old age indicators?
 Has the term "refactoring" shifted to mean "variable renaming" among the
 younger coders these days? Genuine question. I'm baffled that these two
 things could even remotely be considered similar things.
Refactoring is changing your code, without changing what it does. It is done in order to make your code easier to understand, more maintainable, more testable, or whatever. Both extracting some code into a function and renaming a variable are refactoring operations.
With effective semantic analysis, there are many useful refactoring tools that become possible. Renaming things is the simplest and probably most common, so it's usually a good starting point.

Other possibilities for instance:
  * Not just renaming functions, but changing their signatures in general. If types are changed, then some handling likely needs to be performed (or at least alerted) at call sites
  * Renaming modules (required when moving source files within the source tree); update imports, etc.
  * Reordering function parameters automatically reorders terms supplied to calls of the function (how many times have you wanted to reverse 2 function parameters, and they've both been int? impossible to update all references without error)
  * Assignments to delegates may automatically produce an appropriate function stub, along with typical function name 'On[event-name-being-assigned-to]() { }' or whatever is conventional, and proper parameters
  * Highlight some text, 'refactor into own function' can move a block of selected code into a separate function, and correctly update the call site with the appropriate call, automatically adding function parameters for previously referenced local variables (seriously, how many times have you done this, and then carefully had to chase up the locals that are referenced to build the parameter list?)
  * In addition to the prior point, scan for additional instances of the code you just refactored, and offer to also replace with calls to the new function (again, properly hooking up arguments)
  * Auto-magic @property/accessor generation from member variables
  * Auto-magically produce stubs for methods declared in interfaces
  * And many more possibilities...
  * Produce tools to statically enforce project coding standards; i.e., produce red underlines beneath terms in the editor that don't conform to project standards

If you develop habits around these sorts of tools, you can be far more productive, and then when they are gone, it feels kinda like trying to use a mouse without a wheel (ever tried that recently?).

They can also significantly increase the quality of code overall, since trivial refactoring in many forms becomes an instantaneous operation rather than wasting programmers' working hours. And tools like project coding standards enforcement will keep the slackers in check.
Dec 09 2013
parent reply Xavier Bigand <flamaros.xavier gmail.com> writes:
On 10/12/2013 03:54, Manu wrote:
 On 10 December 2013 11:42, deadalnix <deadalnix gmail.com
 <mailto:deadalnix gmail.com>> wrote:

     On Sunday, 8 December 2013 at 05:37:09 UTC, H. S. Teoh wrote:

         Or maybe this is just another one of those cultural old age
         indicators?
         Has the term "refactoring" shifted to mean "variable renaming"
         among the
         younger coders these days? Genuine question. I'm baffled that
         these two
         things could even remotely be considered similar things.


     Refactoring is changing your code, without changing what it does. It
     is done in order to make your code easier to understand, more
     maintainable, more testable, or whatever.

     Both extracting some code into a function and renaming a variable
     are refactoring operations.


 With effective semantic analysis, there are many useful refactoring
 tools that become possible.
 Renaming things is the simplest and probably most common, so it's
 usually a good starting point.

 Other possibilities for instance:
   * Not just renaming functions, but changing it's signature in general.
 If types are changed, then some handling likely needs to be performed
 (or at least alerted) at call sites
   * Renaming modules (required when moving source files within the
 source tree); update imports, etc.
   * Reordering function parameters automatically reorders terms supplied
 to calls of the function (how many times have you wanted to reverse 2
 function parameters, and they've both been int? impossible to update all
 references without error)
   * Assignments to delegates may automatically produce an appropriate
 function stub, along with typical function name
 'On[event-name-being-assigned-to]() { }' or whatever is conventional,
 and proper parameters
   * Highlight some text, 'refactor into own function' can move a block of
 selected code into a separate function, and correctly update the call
 site with the appropriate call, automatically adding function parameters
 for previously referenced local variables (seriously, how many times
 have you done this, and then carefully had to chase up the locals that
 are referenced to build the parameter list?)
   * In addition to the prior point, scan for additional instances of the
 code you just refactored, and offer to also replace with calls to the
 new function (again, properly hooking up arguments)
   * Auto-magic  property/accessor generation from member variables
   * Auto-magically produce stubs for methods declared in interfaces
   * And many more possibilities...
   * Produce tools to statically enforce project coding standards; i.e.,
 produce red underlines beneath terms in the editor that don't conform to
 project standards
* Add/Remove import automatically
* Underline in red errors without compiling
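
To make the parameter-reordering item in the quoted list concrete, here is a minimal D sketch. The names `Size`, `gridV1` and `gridV2` are made up for illustration and come from no real project. The only point is that when both parameters are `int`, a stale call site still compiles after a manual reorder:

```d
// Hypothetical example: all names here are illustrative only.
struct Size
{
    int rows;
    int cols;
}

// Original signature: rows first, then cols.
Size gridV1(int rows, int cols)
{
    return Size(rows, cols);
}

// Same function after someone reverses the two parameters by hand.
Size gridV2(int cols, int rows)
{
    return Size(rows, cols);
}

unittest
{
    // Call site written against the original signature.
    assert(gridV1(2, 5) == Size(2, 5));

    // Stale call site: it still compiles because both arguments are int,
    // but rows and cols are now silently swapped.
    assert(gridV2(2, 5) == Size(5, 2));

    // What a semantic refactoring tool would rewrite the call to.
    assert(gridV2(5, 2) == Size(2, 5));
}
```

Building with `dmd -unittest -main` should run the checks. A tool with real semantic information could perform the third rewrite at every call site, which is the part that is error-prone by hand.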
Dec 10 2013
parent "Atila Neves" <atila.neves gmail.com> writes:
   * Underline in red errors without compiling
I get this one for free in Emacs with flycheck. It's not _really_ without compiling since it calls the compiler behind my back, but it's essentially the same thing.
Dec 10 2013
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
On 10 December 2013 12:32, H. S. Teoh <hsteoh quickfur.ath.cx> wrote:

 On Tue, Dec 10, 2013 at 02:42:05AM +0100, deadalnix wrote:
 On Sunday, 8 December 2013 at 05:37:09 UTC, H. S. Teoh wrote:
Or maybe this is just another one of those cultural old age
indicators?  Has the term "refactoring" shifted to mean "variable
renaming" among the younger coders these days? Genuine question. I'm
baffled that these two things could even remotely be considered
similar things.
Refactoring is changing your code, without changing what it does. It is done in order to make your code easier to understand, more maintainable, more testable, or whatever. Both extracting some code into a function and renaming a variable are refactoring operations.
OK, but isn't renaming among the more trivial refactoring operations? Albeit, granted, it is probably also the most easily automated. Splitting an overly-long function into smaller pieces is a more meaningful (IMO) refactoring operation, but probably much more difficult to automate (if it even can be).
It can be done. I use tools like this. One of my favourites goes:
  * highlight text
  * click 'refactor' (or hotkey)
  * type name for function

New function is produced containing the selected code. If the selected code was an expression, then the function is generated with a return statement and the appropriate return type. If the selected code relied on any local variables, they are automatically added as function parameters. The call-site is updated with the new function call, and the local variables previously referenced passed to the function as arguments appropriately.

There's room for an extension to this tool which might scan for the same code throughout the project, and offer to replace the same pattern with calls to the new function, again, automatically filling out the function parameters.

This is only one of many useful tools that become available once tooling has a thorough semantic analysis library available. It would be amazing if there was some real effort put into a convenient tooling API in the DMD front-end lib.
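
As a rough before/after sketch of what such an 'extract function' step produces (written by hand here, not output from any real tool; `totalBefore`, `totalAfter` and `withTax` are invented names):

```d
// Before: the expression marked below is what you would highlight.
int totalBefore(int priceCents, int quantity, int taxPercent)
{
    int subtotal = priceCents * quantity;
    return subtotal + subtotal * taxPercent / 100; // <- selected expression
}

// After: the selection becomes its own function. Because it was an
// expression, the new function gets a return statement and a return type,
// and the variables it referenced (subtotal, taxPercent) become parameters.
int withTax(int subtotal, int taxPercent)
{
    return subtotal + subtotal * taxPercent / 100;
}

// The call site is rewritten to call the new function, passing the
// previously referenced variables as arguments.
int totalAfter(int priceCents, int quantity, int taxPercent)
{
    int subtotal = priceCents * quantity;
    return withTax(subtotal, taxPercent);
}

unittest
{
    // Behaviour must be unchanged by the refactoring.
    assert(totalBefore(200, 3, 50) == totalAfter(200, 3, 50));
    assert(totalAfter(200, 3, 50) == 900);
}
```

Working out the parameter list is the step that needs semantic analysis: the tool has to know which names in the selection refer to locals of the enclosing function and what their types are.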
Dec 09 2013
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Manu:

 There's room for an extension to this tool which might scan for
If you keep adding similar features to an IDE, you perceive the code less and less as an amount of text lines and more and more like something fluid, where its semantics become less tied to its look. And this is good. This does not require to save source code in a format different from the usual text.
 It would be amazing if there was some real effort put into a 
 convenient tooling API in the DMD front-end lib.
I agree. Making D/D-front-end a bit more IDE-friendly is good. Bye, bearophile
Dec 09 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/9/2013 7:04 PM, Manu wrote:
 It would be amazing if there was some real effort put into a convenient tooling
 API in the DMD front-end lib.
Generally, the best tools come from people who make them to please themselves, not from people who make them to someone else's spec. I suspect you'd make a very good refactoring tool. How about it?
Dec 09 2013
next sibling parent Manu <turkeyman gmail.com> writes:
On 10 December 2013 13:40, Walter Bright <newshound2 digitalmars.com> wrote:

 On 12/9/2013 7:04 PM, Manu wrote:

 It would be amazing if there was some real effort put into a convenient
 tooling
 API in the DMD front-end lib.
Generally, the best tools come from people who make them to please themselves, not from people who make them to someone else's spec. I suspect you'd make a very good refactoring tool. How about it?
Not without a powerful semantic analysis library I wouldn't. Others have spent years now on semantic analysis libraries trying to duplicate the understanding of the code that DMD must already have while compiling.
Dec 09 2013
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Dec 10, 2013 at 02:01:09PM +1000, Manu wrote:
 On 10 December 2013 13:40, Walter Bright <newshound2 digitalmars.com> wrote:
 
 On 12/9/2013 7:04 PM, Manu wrote:

 It would be amazing if there was some real effort put into a
 convenient tooling API in the DMD front-end lib.
Generally, the best tools come from people who make them to please themselves, not from people who make them to someone else's spec. I suspect you'd make a very good refactoring tool. How about it?
Not without a powerful semantic analysis library I wouldn't. Others have spent years now on semantic analysis libraries trying to duplicate the understanding of the code that DMD must already have while compiling.
We need to work on the "compiler as a library" project. A *lot* of code analysis tools are unnecessarily duplicating effort because the compiler is opaque; if compiler internals could be leveraged, this would open the door for powerful code analysis/manipulation tools, and they'd be *accurate*, and won't need endless maintenance to keep up with compiler changes. Well, they still require maintenance, but the effort would be far easier if much of the codebase is shared. T -- It only takes one twig to burn down a forest.
Dec 09 2013
parent reply "Francesco Cattoglio" <francesco.cattoglio gmail.com> writes:
On Tuesday, 10 December 2013 at 07:51:56 UTC, H. S. Teoh wrote:
 We need to work on the "compiler as a library" project.
I hope everyone agrees on this. My wild guess is that the project will be the next "big thing" to work on after the frontend is moved to D.
Dec 10 2013
parent "Daniel Murphy" <yebblies nospamgmail.com> writes:
"Francesco Cattoglio" <francesco.cattoglio gmail.com> wrote in message 
news:stpuvkzasctgyoryunbg forum.dlang.org...
 On Tuesday, 10 December 2013 at 07:51:56 UTC, H. S. Teoh wrote:
 We need to work on the "compiler as a library" project.
I hope everyone agrees on this. My wild guess is that the project will be the next "big thing" to work on after the frontend is moved to D.
That's the plan.
Dec 10 2013
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Dec 10, 2013 at 12:54:02PM +1000, Manu wrote:
[...]
 If you develop habits around these sorts of tools, you can be far more
 productive, and then when they are gone, it feels kinda like trying to
 use a mouse without a wheel (ever tried that recently?).
[...] Agree with your points, but just had to point out that I don't even *use* the mouse (well, barely), so your example is moot. :-P T -- A computer doesn't mind if its programs are put to purposes that don't match their names. -- D. Knuth
Dec 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/9/2013 11:55 PM, H. S. Teoh wrote:
 Agree with your points, but just had to point out that I don't even
 *use* the mouse (well, barely), so your example is moot. :-P
A vi user? :-)
Dec 10 2013
next sibling parent Iain Buclaw <ibuclaw gdcproject.org> writes:
On 10 December 2013 08:29, Walter Bright <newshound2 digitalmars.com> wrote:
 On 12/9/2013 11:55 PM, H. S. Teoh wrote:
 Agree with your points, but just had to point out that I don't even
 *use* the mouse (well, barely), so your example is moot. :-P
A vi user? :-)
Worse, he's from the emacs crowd. :o)
Dec 10 2013
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Dec 10, 2013 at 08:59:26AM +0000, Iain Buclaw wrote:
 On 10 December 2013 08:29, Walter Bright <newshound2 digitalmars.com> wrote:
 On 12/9/2013 11:55 PM, H. S. Teoh wrote:
 Agree with your points, but just had to point out that I don't even
 *use* the mouse (well, barely), so your example is moot. :-P
A vi user? :-)
Worse, he's from the emacs crowd. :o)
A vim user, actually. And that's text-mode vim, not gvim or any of that fancy stuff. In monochrome. (Also, a ratpoison user. I don't do GUIs.) T -- The best way to destroy a cause is to defend it poorly.
Dec 10 2013
parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Tuesday, 10 December 2013 at 15:09:37 UTC, H. S. Teoh wrote:
 On Tue, Dec 10, 2013 at 08:59:26AM +0000, Iain Buclaw wrote:
 On 10 December 2013 08:29, Walter Bright 
 <newshound2 digitalmars.com> wrote:
 On 12/9/2013 11:55 PM, H. S. Teoh wrote:
 Agree with your points, but just had to point out that I 
 don't even
 *use* the mouse (well, barely), so your example is moot. :-P
A vi user? :-)
Worse, he's from the emacs crowd. :o)
A vim user, actually. And that's text-mode vim, not gvim or any of that fancy stuff. In monochrome. (Also, a ratpoison user. I don't do GUIs.) T
That makes my GUI soul bleed. :) -- Paulo
Dec 10 2013
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
On 8 December 2013 11:34, Walter Bright <newshound2 digitalmars.com> wrote:

 On 12/7/2013 4:46 PM, Manu wrote:

 True as compared to C, but I wouldn't say this is true in general.

 such tools
 yet.

 with
 great anticipation.
 Maybe when the D front-end is a library, and tooling has such powerful
 (and
 reliable) semantic analysis as the compiler does it may be possible?
Needing a tool to refactor code is a bit of a mystery to me. I've never used one, and never felt that not having one inhibited me from refactoring.
Well I'd suspect that given time to use one for a significant portion of time, you will come to appreciate how much time it can save :) At least in certain types of code, which perhaps you don't spend an awful lot of time writing? I find 'client code' tends to be subject to a higher frequency of trivial changes and refactorings. This sort of code is less concise, more random; just stuff that does stuff or responds to events or whatever written around the place. Systems code like compilers tends to be a lot more succinct, self contained and well structured, which supports simpler refactoring internally, but any such change may require a huge amount of client code to be re-jigged. It's nice to reliably automate this sort of thing. Trust me, robust refactoring tools save a lot of time! :)
Dec 07 2013
next sibling parent "Luís Marques" <luis luismarques.eu> writes:
On Sunday, 8 December 2013 at 02:07:50 UTC, Manu wrote:
 Trust me, robust refactoring tools save a lot of time! :)
More than just time, Walter has shown in the past that he appreciates safety and safe practises. For instance, I respect his decision about the limitations of version(), even though it seems limiting to me. Trusting human refactoring to the detriment of a tool seems, comparatively, insane, and cowboy programming, with no trade-off benefit :-)
Dec 07 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/7/2013 6:07 PM, Manu wrote:
 At least in certain types of code, which perhaps you don't spend an awful lot
of
 time writing?
Funny thing about that. Sometimes I'll spend all day on a piece of code, then check it in. I'm surprised that the diffs show my changes were very small. I suppose I spend far more time thinking about code than writing it.
Dec 07 2013
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Dec 07, 2013 at 08:07:08PM -0800, Walter Bright wrote:
 On 12/7/2013 6:07 PM, Manu wrote:
At least in certain types of code, which perhaps you don't spend an
awful lot of time writing?
Funny thing about that. Sometimes I'll spend all day on a piece of code, then check it in. I'm surprised that the diffs show my changes were very small.
At my job, we actually take pride in minimizing our diffs. My manager once mentioned that sometimes, it could take days to produce a one-line diff, because it takes that long to (1) find the bug and (2) figure out the least intrusive way to fix it. Diffs that are obviously larger than necessary (esp. with frivolous whitespace changes) will often be rejected during the code review process, or, at the very least, the submitter will be told to rework his diffs to avoid touching stuff unrelated to the actual code fix. (Sadly, this isn't done during feature branch merges, and so a lot of poor code tends to creep in through feature branches. Sigh. Can't have your cake and eat it too, I guess.)
 I suppose I spend far more time thinking about code than writing it.
When I was in school, I was taught to do that. I think I take it to the extremes, though. Sometimes I'd think about a piece of code for months before actually writing it because I just can't sort out all the details in my head. Often I have to force myself to just start writing the code, and then the details I was worried about tend to work themselves out quite nicely. Now obviously, starting to write code without the slightest idea about what kind of algorithms should be used, etc., is a bad idea, and tends to lead to bad, non-maintainable code. But thinking about it too much leads to non-productivity. So there's a balance to be struck somewhere, but I'm still trying to figure out where that is. :) T -- "A one-question geek test. If you get the joke, you're a geek: Seen on a California license plate on a VW Beetle: 'FEATURE'..." -- Joshua D. Wachs - Natural Intelligence, Inc.
Dec 07 2013
prev sibling parent Manu <turkeyman gmail.com> writes:
On 8 December 2013 14:07, Walter Bright <newshound2 digitalmars.com> wrote:

 On 12/7/2013 6:07 PM, Manu wrote:

 At least in certain types of code, which perhaps you don't spend an awful
 lot of
 time writing?
Funny thing about that. Sometimes I'll spend all day on a piece of code, then check it in. I'm surprised that the diffs show my changes were very small. I suppose I spend far more time thinking about code than writing it.
Precisely my point. I have a strong suspicion that you've spent many consecutive years now writing code in that way. I don't think the job of a systems/tech programmer is a close approximation of the vast majority of modern programmers' days. I think the bulk of programmers are employed to bash out code and string things together to produce their particular product, be it on the web, or some application, or whatever. I think 'most' code is the stringing together of libraries in ways specific to the project requirements, and that sort of code _really_ benefits from powerful tooling. Also, most projects have unstable goal-posts, which leads to inevitable refactoring of your project whenever project requirements change.

Current D tooling offers... well, barely anything. VisualD and the like are good first steps. We basically have syntax highlighting, and to a very limited degree, suggestion + auto-completion. I couldn't (**wouldn't) work without those as a minimum, but these only begin to scratch the surface.

I've often wondered what the DMD front end could offer if it was properly extracted into a library with a well designed API for tooling projects to interact with. One of the hard problems is that you want tools to make proper suggestions and highlight errors and stuff in realtime, on code as it's being typed, which doesn't actually compile due to incomplete expressions or errors. I suspect systems like the CTFE system could help here, since it is able to perform semantic analysis (and execution) on subsets of code; i.e., it seems it can evaluate expressions even where there are compile errors in surrounding code.

That old Eclipse tool that offered an 'intermediate' view that unfolded mixins and template expansion was awesome. I would find that so useful!
Dec 07 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 Needing a tool to refactor code is a bit of a mystery to me. 
 I've never used one, and never felt that not having one 
 inhibited me from refactoring.
When you have experience and you know how to do things, you have a great temptation to keep doing them the same way, and to limit your exploration and study of alternatives. But this could lead to missed opportunities and crystallization of skills. One strategy to avoid this pitfall is to allocate some time to learn different things, but often this is not enough. A way to improve the situation is to work for some time with a person of a very different age. Younger people seem ignorant, but teaching them for some time is not a waste of time, because their age means they are not burdened by very old ways of doing things, and they teach you a lot in return. Refactoring tools are an important part of modern programming. If you don't understand why, then I suggest you stop debugging D for a few days, install a modern IDE, find a good open source Java project on GitHub, and follow some tutorials that explain how to refactor Java code. In a few days you could send some patches to the project and learn what modern IDEs do. It's even better if you find some younger person willing to work with you on this, but this is not essential :-) Bye, bearophile
Dec 07 2013
prev sibling parent "Atila Neves" <atila.neves gmail.com> writes:
On Sunday, 8 December 2013 at 01:34:34 UTC, Walter Bright wrote:
 On 12/7/2013 4:46 PM, Manu wrote:
 True as compared to C, but I wouldn't say this is true in 
 general.

 have any such tools
 yet.

 such tooling with
 great anticipation.
 Maybe when the D front-end is a library, and tooling has such 
 powerful (and
 reliable) semantic analysis as the compiler does it may be 
 possible?
Needing a tool to refactor code is a bit of a mystery to me. I've never used one, and never felt that not having one inhibited me from refactoring.
I used to think that until I tried rope in Emacs for editing Python code. After that I wanted proper refactoring for everything else as well. Renaming doesn't require checks, extracting a method is just easy... it's saved me a _lot_ of time and I just couldn't imagine having done some of the work I did without it.
Dec 09 2013
prev sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 08/12/13 01:46, Manu wrote:
 True as compared to C, but I wouldn't say this is true in general.
Let's give praise where praise is due -- with D, the ease of refactoring is _entirely_ down to the language. You get it without needing an IDE to support you. It's a very impressive achievement. That said, this is a _positive_ reason for pursuing the other tools -- imagine how much better it could be, with IDE refactoring support and other such benefits.
Dec 08 2013
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/6/13 11:34 PM, Maxim Fomin wrote:
 On Friday, 6 December 2013 at 23:19:22 UTC, Walter Bright wrote:
 Dmitry's regex also showed huge gains over C regex implementations.
This C code is easy to fix.
That's not quite so. The second best after ctRegex is the Javascript V8 engine, which does the same thing - JITs the regex. This is not an easy fix for a traditional engine written in C. Andrei
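
For readers who haven't used it, here is a minimal sketch of `ctRegex` from `std.regex`. The helper `splitRange`, the pattern and the input string are arbitrary examples, and this says nothing about benchmark numbers. It only shows that the pattern is a template argument, so the matcher for it is built during compilation rather than at run time:

```d
import std.regex;

// Hypothetical helper: pulls the two numbers out of an "a-b" range.
// Returns two empty strings when the input does not match.
string[2] splitRange(string s)
{
    // The pattern is a compile-time template argument; the matcher for it
    // is generated while the program is being compiled.
    auto re = ctRegex!(`(\d+)-(\d+)`);

    auto m = matchFirst(s, re);
    if (m.empty)
        return ["", ""];
    return [m[1], m[2]];
}

unittest
{
    assert(splitRange("range 10-20")[0] == "10");
    assert(splitRange("range 10-20")[1] == "20");
    assert(splitRange("no numbers here")[0] == "");
}
```

A traditional C engine has to parse and compile the pattern when the program runs, which is the gap being discussed here.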
Dec 07 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/6/13 11:34 PM, Maxim Fomin wrote:
 And again, notice that you are speaking about 'hypothetical advantages'
 (language advantages) which implies two things:
 1) current efficiency is worse when comparing with some benchmark
 2) despite many years of development, community failed to realize these
 advantages.

 This makes me think that probably there is another reason of why code is
 less efficient, for example fundamental characteristics of the language
 make it hard to be quick. This is not bad per se, but saying that
 language code can be faster than C, taking into account so many
 problems with D, looks like advertisement, rather than technical
 comparison.
That is not advertisement, and your tenor is dissonant in this forum. This community, as pretty much any other languages', is looking for strategic advantages compared to other languages, and how to exploit them. It is perfectly natural to look for places where D semantics may lead to better code and exploit those at their best. D code is easy to make fast. This has been shown numerous times in various contexts. There are also convenience features that are not aimed at performance, bugs, and insufficiencies. Enumerating them in an attempt to argue that D code is not or cannot be faster than C is missing the point. Andrei
Dec 07 2013
parent reply "Maxim Fomin" <maxim maxim-fomin.ru> writes:
On Saturday, 7 December 2013 at 15:42:54 UTC, Andrei Alexandrescu 
wrote:
 On 12/6/13 11:34 PM, Maxim Fomin wrote:
 And again, notice that you are speaking about 'hypothetical 
 advantages'
 (language advantages) which implies two things:
 1) current efficiency is worse when comparing with some 
 benchmark
 2) despite many years of development, community failed to 
 realize these
 advantages.

 This makes me think that probably there is another reason of 
 why code is
 less efficient, for example fundamental characteristics of the 
 language
 make it hard to be quick. This is not bad per se, but saying
 that language code can be faster than C, taking into account
 so many problems with D, looks like advertisement, rather than
 technical comparison.
That is not advertisement, and your tenor is dissonant in this forum.
This is not the first time you have commented on my 'tenor', which is an unusual thing. Of course it is dissonant in this 'forum', if by forum you mean constant advertisement which does not express full information about the subject (like reddit link bombing). When speaking about D newsgroups in general, I find myself pretty neutral.
 This community, as pretty much any other languages', is looking 
 for strategic advantages compared to other languages, and how 
 to exploit them. It is perfectly natural to look for places 
 where D semantics may lead to better code and exploit those at 
 their best.
I am happy with looking at strategic advantages, but strategic advantages are not an excuse for current disadvantages. Perhaps some major bugs should be fixed before promoting the language?
 D code is easy to make fast. This has been shown numerous times 
 in various contexts. There are also convenience features that 
 are not aimed at performance, bugs, and insufficiencies. 
 Enumerating them in an attempt to argue that D code is not or 
 cannot be faster than C is missing the point.


 Andrei
I remember the regex implementation as an example of superior performance. I don't remember any other libraries or projects (perhaps vibe.d in some activities). Meanwhile, many people regularly complain about various problems. I remember only one thread (it was started by Jonathan) which was devoted to the single purpose of describing how the D language is useful and good for solving problems.
Dec 07 2013
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/7/13 8:23 AM, Maxim Fomin wrote:
 On Saturday, 7 December 2013 at 15:42:54 UTC, Andrei Alexandrescu wrote:
 That is not advertisement, and your tenor is dissonant in this forum.
This is not the first time you have commented on my 'tenor', which is an unusual thing.
I agree it's unusual.
 Of course, it is dissonant in this 'forum', if you mean
 by forum constant advertisement which does not express full information
 about the subject (like reddit link bombing).
What links did I post on reddit that would be inappropriate for the charter of that site?
 When speaking about D
 newsgroups in general, I found myself pretty neutral.
I'd find that difficult to argue. Consider: http://forum.dlang.org/thread/l37h5s$2gd8$1 digitalmars.com?page=3#post-wllaccthfrnvquuhsbsr:40forum.dlang.org It was a factual, undeniable piece of good news: some nontrivial D code was starting daily use at Facebook. There was no sensationalizing to complain about. Yet there had to be one smartaleck comment, and guess who it came from. Just browse through your own posting history. Making one's name a great predictor (NLP machine learning algos for sentiment analysis would love it) for snarky comments, facts be damned, is hardly being neutral.
 This community, as pretty much any other languages', is looking for
 strategic advantages compared to other languages, and how to exploit
 them. It is perfectly natural to look for places where D semantics may
 lead to better code and exploit those at their best.
I am happy with looking at strategic advantages, but strategic advantages are not an excuse for current disadvantages. Perhaps some major bugs should be fixed before promoting the language?
I appreciate your very strong contributions to the bug reports and associated discussions. It seems you are worried we're too busy keeping in denial about our shortcomings and feel the urge to keep us grounded every so often. Worry not. There is time to focus, and there is time to chillax.
 D code is easy to make fast. This has been shown numerous times in
 various contexts. There are also convenience features that are not
 aimed at performance, bugs, and insufficiencies. Enumerating them in
 an attempt to argue that D code is not or cannot be faster than C is
 missing the point.
I remember the regex implementation as an example of superior performance. I don't remember any other libraries or projects (perhaps vibe.d in some activities). Meanwhile, many people regularly complain about various problems. I remember only one thread (it was started by Jonathan) which was devoted to the single purpose of describing how the D language is useful and good for solving problems.
I agree we should fix problems that prevent maximum performance from being achieved simply and easily. Andrei
Dec 07 2013
prev sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 7 December 2013 at 16:23:17 UTC, Maxim Fomin wrote:
 This is not the first time you have commented on my 'tenor', which
 is an unusual thing. Of course, it is dissonant in this
 'forum', if you mean by forum constant advertisement which does 
 not express full information about the subject (like reddit 
 link bombing). When speaking about D newsgroups in general, I 
 found myself pretty neutral.
Sorry, you aren't. Your arguments don't make a lot of sense. D does not devirtualize? Can you explain to me how C does? You are actually mentioning drawbacks that C has too; the only difference between C and D on these is that C does not have the advantages.
 I remember the regex implementation as an example of superior
 performance. I don't remember any other libraries or projects
 (perhaps vibe.d in some activities). Meanwhile, many people
 regularly complain about various problems. I remember only one
 thread (it was started by Jonathan) which was devoted to the single
 purpose of describing how the D language is useful and good for
 solving problems.
Tango.xml is another example. When it comes to text processing, C is no match for D.
Dec 07 2013
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
07-Dec-2013 02:20, Walter Bright wrote:
 "there is no way proper C code can be slower than those languages."
 3. Function inlining has generally been shown to be of tremendous value
 in optimization. D has access to all the source code in the program, or
 at least as much as you're willing to show it, and can inline across
 modules.
Uh-oh. I'd avoid advertising this particular point until after a critical bug is fixed: https://d.puremagic.com/issues/show_bug.cgi?id=10985 Applies to all 3 compilers. Otherwise - it's spot on. D has many ways to be typically "faster then Cee" ;) -- Dmitry Olshansky
Dec 06 2013
parent reply Johannes Pfau <nospam example.com> writes:
On Sat, 07 Dec 2013 03:12:02 +0400
Dmitry Olshansky <dmitry.olsh gmail.com> wrote:

 07-Dec-2013 02:20, Walter Bright wrote:
 "there is no way proper C code can be slower than those languages."
 3. Function inlining has generally been shown to be of tremendous
 value in optimization. D has access to all the source code in the
 program, or at least as much as you're willing to show it, and can
 inline across modules.
Uh-oh. I'd avoid advertising this particular point until after a critical bug is fixed: https://d.puremagic.com/issues/show_bug.cgi?id=10985 Applies to all 3 compilers. Otherwise - it's spot on. D has many ways to be typically "faster then Cee" ;)
But cross-module inlining can't be done if you have shared libraries, because you cannot know if the other module is in a shared library, right? If you inlined such code it wouldn't get updated if the shared library was updated and you'd have two versions of the code around...

I see only two solutions to avoid this:

(1) If the source files are compiled at once it's safe to assume they must be part of the same library and inlining is safe
(2) The linker of course knows how objects fit together, so LTO.
Dec 07 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
07-Dec-2013 13:32, Johannes Pfau wrote:
 On Sat, 07 Dec 2013 03:12:02 +0400
 Dmitry Olshansky <dmitry.olsh gmail.com> wrote:

 07-Dec-2013 02:20, Walter Bright wrote:
 "there is no way proper C code can be slower than those languages."
 3. Function inlining has generally been shown to be of tremendous
 value in optimization. D has access to all the source code in the
 program, or at least as much as you're willing to show it, and can
 inline across modules.
Uh-oh. I'd avoid advertising this particular point until after a critical bug is fixed: https://d.puremagic.com/issues/show_bug.cgi?id=10985 Applies to all 3 compilers. Otherwise - it's spot on. D has many ways to be typically "faster then Cee" ;)
But cross-module inlining can't be done if you have shared libraries, because you cannot know if the other module is in a shared library, right?
I've no idea how a compiler would inline the internals of a shared library. I believe it can inline everything in the same library that is not exported, right? (Is that your point (1)?)
 If you inlined such code it wouldn't get updated if the shared library
 was updated and you'd have two versions of the code around...
Then some critical stuff simply must not be called via shared library and better be translated to ... 0-argument templates. Use cases: std.ascii and not only that. Checking ASCII isAlpha as a _function call_ is madness.
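
A minimal sketch of the zero-argument template idea, using a hand-written stand-in (`isAlphaT` is a made-up name, not the actual `std.ascii` source). Because it is a template, it is instantiated in the module that calls it, so the body is visible to the inliner there instead of sitting behind a shared library call. Whether a given compiler actually inlines it is still up to the optimizer:

```d
// Hypothetical stand-in for an ASCII isAlpha check; not the Phobos source.
// The empty template parameter list "()" makes this a template, so the
// function is instantiated in the calling module.
bool isAlphaT()(dchar c) pure nothrow @safe
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

unittest
{
    // Called like an ordinary function; no explicit instantiation needed.
    assert(isAlphaT('q'));
    assert(isAlphaT('Q'));
    assert(!isAlphaT('7'));
}
```

The trade-off is the one raised earlier in this subthread: code instantiated into the caller does not pick up a later fix in the library until the caller is recompiled.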
 I see only two solutions to avoid this:

 (1) If the source files are compiled at once it's safe to assume they
      must be part of the same library and inlining is safe
 (2) The linker of course knows how objects fit together, so LTO.
-- Dmitry Olshansky
Dec 07 2013
parent Johannes Pfau <nospam example.com> writes:
On Sat, 07 Dec 2013 23:37:08 +0400
Dmitry Olshansky <dmitry.olsh gmail.com> wrote:

 But cross-module inlining can't be done if you have shared libraries,
 because you cannot know if the other module is in a shared library,
 right?
I've no idea how a compiler would inline the internals of a shared library. I believe it can inline everything in the same library that is not exported, right? (Is that your point (1)?)
I think it can also inline stuff in the same library even if it's exported. But say for example libfoo is using libphobos and functions of libphobos are inlined into libfoo. Now there's a bugfix in libphobos but libfoo doesn't see it, as the function was inlined. You'd have to recompile libfoo as well.

The difficult part is, how do you know if two source files end up in the same library? Say you compile foo.d which depends on foo2.d and std/algorithm.d and you have the full source code for all three. Now how do you know that you shouldn't inline from algorithm.d but you can inline from foo2.d? You can only be sure if foo.d and foo2.d are compiled at once. Or with LTO.
 
 If you inlined such code it wouldn't get updated if the shared
 library was updated and you'd have two versions of the code
 around...
Then some critical stuff simply must not be called via shared library and better be translated to ... 0-argument templates. Use cases: std.ascii and not only that. Checking ASCII isAlpha as a _function call_ is madness.
That solution is also often used in C: Use macros for small functions that always need to be inlined. It's not exactly pretty but I don't know a better solution.
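As a rough sketch of the "0-argument template" idea discussed above: the names `isAlphaFn` and `isAlphaTpl` below are hypothetical, not Phobos functions. A template is instantiated in the module that uses it, so its body is compiled into the caller's object file and is visible to the inliner, much like a C macro or a function defined in a header. Whether a given compiler actually inlines it is an implementation matter.

```d
// Ordinary function: if it lives in a separately built (shared) library,
// a caller that sees only its declaration can only emit a call to it.
bool isAlphaFn(dchar c) pure nothrow @safe
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// Same logic as a template with an empty parameter list. The body is
// instantiated wherever it is used, so the caller always has the source.
bool isAlphaTpl()(dchar c) pure nothrow @safe
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}
```

Both are called the same way, e.g. `isAlphaTpl('x')`; the trade-off is that a fix to the template body only reaches clients that are recompiled.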
Dec 07 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Another thing to take into account is that C is no longer really
the gold standard for code performance. Today people who want
to write fast code often use parallel algorithms on GPUs with
CUDA/OpenCL (which look like C with extras), and when they are on
CPUs they need to use all the cores and the SIMD units of the
CPUs efficiently. (See the ideas of the Intel language compiled by the ispc
compiler: http://ispc.github.io/ It contains ideas useful for
improving the D language. D is lacking here.)

Bye,
bearophile
Dec 06 2013
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Dec 06, 2013 at 02:20:22PM -0800, Walter Bright wrote:
 
 "there is no way proper C code can be slower than those languages."
 
   -- http://www.reddit.com/r/programming/comments/1s5ze3/benchmarking_d_vs_go_vs_erlang_vs_c_for_mqtt/cduwwoy
 
 comes up now and then. I think it's incorrect, D has many inherent
 advantages in generating code over C:
 
 1. D knows when data is immutable. C has to always make worst case
 assumptions, and assume indirectly accessed data mutates.
Does the compiler currently take advantage of this, e.g., in aliasing analysis?
 2. D knows when functions are pure. C has to make worst case assumptions.
Does the compiler currently take advantage of this?
 3. Function inlining has generally been shown to be of tremendous
 value in optimization. D has access to all the source code in the
 program, or at least as much as you're willing to show it, and can
 inline across modules. C cannot inline functions unless they appear
 in the same module or in .h files. It's a rare practice to push many
 functions into .h files. Of course, there are now linkers that can
 do whole program optimization for C, but those are kind of herculean
 efforts to work around that C limitation of being able to see only
 one module at a time.
To be frank, dmd's current inlining capabilities are rather disappointing. However, gdc IME has shown better track record in that area, so it looks promising.
 4. C strings are 0-terminated, D strings have a length property. The
 former has major negative performance consequences:
 
     a. lots of strlen()'s are necessary
Yeah, this is a common C bottleneck that most obsessively-optimizing C coders overlook. Many a time profiling of my "optimal" code turned up surprising bottlenecks, like fprintf's in the wrong place, or strlen's in inner loops. D's strings may be "heavier" (in the sense that they require a length field as opposed to a mere pointer -- y'know, the kind of stuff obsessively-optimizing C coders lose sleep over), but it comes with so many advantages that it's more than worth the price.
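To make the strlen-in-a-loop problem concrete, here is a minimal sketch in D that contrasts a C-style scan with a slice-based one. The function names are made up for illustration; `strlen` is the C library function as exposed by `core.stdc.string`.

```d
import core.stdc.string : strlen;

// C-style: the end of the string is only known by scanning for the 0 byte.
// Written naively, strlen() is evaluated on every iteration, which makes
// the loop quadratic unless the programmer or the optimizer hoists the call.
size_t countSpacesCStyle(const(char)* s)
{
    size_t n = 0;
    for (size_t i = 0; i < strlen(s); ++i)
        if (s[i] == ' ')
            ++n;
    return n;
}

// D-style: the slice carries its length, so the loop bound is already known.
size_t countSpaces(const(char)[] s)
{
    size_t n = 0;
    foreach (c; s)
        if (c == ' ')
            ++n;
    return n;
}
```

D string literals are 0-terminated and convert implicitly to `const(char)*`, so both functions can be called with the same literal.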
     b. using substrings usually requires a malloc/copy/free sequence
Yeah, string manipulation in C (and to a great extent C++) is a royal pain. It actually drove me to use Perl for non-performance critical string manipulating code for quite a good number of years. The heavy syntax required for using a regex library in C, the memory management nightmare of keeping track of substrings, all contribute to fragile code, increased development times, and poorer performance. Only obsessively-optimizing C coders can fail to see this (and I used to be among them). ;-)
 5. CTFE can push a lot of computation to compile time rather than
 run time. This has had spectacular positive performance consequences
 for things like regex. C has no CTFE ability.
D's compile-time capabilities were one of the major factors in my adoption of D. The fact that it allows one to write regex code with nice syntax yet zero runtime overhead is awesome. One of my favorite demonstrations of D's compile-time capabilities is in TDPL, where Andrei described a PRNG generator that checks against poor PRNG parameters at compile-time, and stops compilation if the checks fail. So if it compiles, you're guaranteed the PRNG is good. In C? Well you google for prime numbers and stick randomly-chosen ones into your PRNG parameters, and cross your fingers and hope for the best... (Or at best, the code will do the checks at runtime and abort, but honestly, *what* C coder would do such a thing?)
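A minimal sketch of that kind of compile-time parameter check, assuming a simple linear congruential generator. This is not the TDPL or Phobos code: the names `gcd` and `Lcg` are hypothetical, and only one condition (increment and modulus coprime) is checked.

```d
// An ordinary function; the compiler can also run it at compile time (CTFE).
ulong gcd(ulong a, ulong b) pure nothrow @safe
{
    while (b != 0)
    {
        immutable t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// The generator parameters are template arguments, so they can be
// validated with static assert before any code is generated.
struct Lcg(ulong a, ulong c, ulong m)
{
    static assert(m != 0, "modulus must be nonzero");
    static assert(c == 0 || gcd(c, m) == 1,
                  "increment and modulus must be coprime");

    ulong state;

    ulong next()
    {
        state = (a * state + c) % m;
        return state;
    }
}

// Lcg!(5, 4, 16) would be rejected at compile time, since gcd(4, 16) != 1.
```

With `Lcg!(5, 3, 16) g;` the checks pass during compilation and `g.next()` costs nothing extra at run time for the validation.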
 6. D's array slicing coupled with GC means that many
 malloc/copy/free's normally done in C are unnecessary in D.
It does mean we need to get our act together and improve the GC's performance, though. ;-)

But this is a major advantage in array manipulation (esp. string manipulation) in D. In equivalent C code, best practices dictate that slicing should always be done by allocating a new array and copying, because otherwise you introduce intricate dependencies between pointers and your code either becomes wrong (leaks memory / has dangling pointers), or extremely convoluted (reinvent the GC with reference counting, etc.). Being able to freely slice any array you want without needing to keep track of every single pointer is a big savings in development time, not to mention better runtime performance because you don't have to keep allocating & copying. All that copying does add up!

There's also another aspect to this: being built into the language, D slices allow different pieces of code to freely exchange them without needing to worry about memory management. In equivalent C code, an efficient implementation of slices would require the use of custom structs (as a pair of pointers, or a pointer + length ala D), or specific function parameter conventions (e.g. always pass pointer + length). Since everyone will implement this in a slightly different way, it makes interoperating different bits of code a pain. One function takes a pointer + length, another function takes a pointer pair, a third function takes a custom struct that encapsulates the foregoing, and a fourth function takes *another* custom struct that does the same thing but with a different implementation -- so now to make the code work together, you have to insert intermediate layers of code to convert between the representations. It's a big development time sink, and yet another corner for bugs to hide in. In D, since slices are built-in, everybody takes a slice, no interconversion is necessary, and everybody is happy.
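The slicing point can be sketched in a few lines of D. `firstWord` is a hypothetical helper, not a library function; it returns a view into the caller's buffer rather than a malloc'd copy.

```d
// Returns the first space-delimited word of `line` as a slice.
// No allocation and no copy: the result points into `line`'s memory.
const(char)[] firstWord(const(char)[] line)
{
    size_t start = 0;
    while (start < line.length && line[start] == ' ')
        ++start;

    size_t end = start;
    while (end < line.length && line[end] != ' ')
        ++end;

    return line[start .. end];
}
```

For `auto s = "  hello world";` the result of `firstWord(s)` has `.ptr == s.ptr + 2` and `.length == 5`, and the GC keeps the underlying buffer alive for as long as any slice refers to it.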
 7. D's "final switch" enables more efficient switch code generation,
 because the default doesn't have to be considered.
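For reference, a minimal sketch of what `final switch` looks like; the enum `Color` and the function `name` are made up for illustration. Whether a particular compiler generates better code for it is an implementation question, but the language guarantees that no `default` path has to be considered.

```d
enum Color { red, green, blue }

string name(Color c)
{
    // A final switch must list every member of Color and may not have a
    // default. Adding a member to Color turns this into a compile error
    // until the new case is handled here.
    final switch (c)
    {
        case Color.red:   return "red";
        case Color.green: return "green";
        case Color.blue:  return "blue";
    }
}
```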
Not to mention the maintenance advantages of final switch: modify the enum, and the compiler automatically points you to all the places in the code where the new enum value has to be handled. In C, you just modify the enum, and everything still compiles, except now things don't quite work until you add the new value to every switch statement that needs to handle it. But it's easy to miss one or two places where this is needed, so you have latent bugs that only show up through thorough testing. (And everybody knows that C codebases always come with thorough unittests, right? *ahem* More likely, these bugs go unnoticed until it blows up in a customer's production environment, then you have to pull an all-nighter to track down what's causing it.)

It's not surprising that this led to the belief that switch statements are evil, as any OO aficionado would tell you. Well, *I* say that they're evil only because C's implementation of them is so primitive. In D, they can actually be a good thing! (Of course, you can still screw yourself over in D by inserting empty default cases into the wrong switch statements, but you have to actively shoot yourself in the foot to do it, in which case it's really your own fault, not the language's.)

And I'd also add 8.: most applications can get by with using a GC, and this actually lets you write more efficient code. One particular example I have in mind is a parse function that returns an AST. In C, the AST nodes would be malloc'd, of course, and then everyone downstream will have to make sure they call free on each and every AST node in order to avoid memory leakage. In the first version of the code, this is no biggie: just write a function called freeAST() or some such, and make sure everyone calls it.
But then you want some parts of the code to hold on to bits of the AST -- say, store a function body with its corresponding symbol table entry -- and now you have to either recursively copy the AST (inefficient), or transplant it from the main AST and make sure you NULL the pointers in the parent tree so that it doesn't get double-freed, or some other such manual technique (very error-prone). But what if *two* places need to hold on to the same AST subtree? Conventional C wisdom says, duplicate it, since otherwise you'll have a major headache later on trying to keep track of when you can free something. Or reinvent the GC with reference-counting, etc., which you'll eventually have to once parts of the trees start cross-referencing each other.

And later yet, you add error-handling, and now you have to make sure every time an error occurs, you call freeAST() on partially-constructed AST subtrees, and soon your initially simple code turns into a spaghetti of "if(error) goto cleanupAST". Miss just *one* of these cases, and you've a memory leak.

In D? Just allocate the nodes and return them freely, and let them cross-reference each other as much as they like -- the GC will clean up after you. Faster to code, less error-prone, and more performant (avoids duplicating AST subtrees, avoids complicated bookkeeping to keep track of when a node can be freed, etc.). Sure the GC will have to pause your program to run its collection cycles -- but that's not much different from a recursive free() of an AST subtree, which doesn't guarantee an upper-bound on running time (if your tree is 1 million nodes, then it has to do 1 million free's, right then, right there, and your program's gonna wait for that to finish, no matter what -- the thought that manual memory management is pause-free is a myth).

It's true that manual memory management targeted for your specific application will always outperform a generic GC, since it can take advantage of your application's specific usage patterns.
But 95% of the time, you don't *need* this kind of tweaking; yet in the C world, you have no choice but to manually manage everything. And chances are, your ad hoc implementation will perform worse than the GC. After all, not everybody is an expert at writing memory management code. And, in a team project, expect people to screw up and introduce poor performance and/or pointer bugs, even if you've tweaked the thing to perfection. Besides, it's so much effort for only a little gain. C coders who have never used a GC'd language don't know what they're missing, when it comes to ease of dealing with complex cross-referencing data structures.

T

-- Indifference will certainly be the downfall of mankind, but who cares? -- Miquel van Smoorenburg
Dec 06 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/6/2013 3:39 PM, H. S. Teoh wrote:
 1. D knows when data is immutable. C has to always make worst case
 assumptions, and assume indirectly accessed data mutates.
Does the compiler currently take advantage of this, e.g., in aliasing analysis?
I'm pretty sure dmd does, don't know about others.
 2. D knows when functions are pure. C has to make worst case assumptions.
Does the compiler currently take advantage of this?
dmd does.
 3. Function inlining has generally been shown to be of tremendous
 value in optimization. D has access to all the source code in the
 program, or at least as much as you're willing to show it, and can
 inline across modules. C cannot inline functions unless they appear
 in the same module or in .h files. It's a rare practice to push many
 functions into .h files. Of course, there are now linkers that can
 do whole program optimization for C, but those are kind of herculean
 efforts to work around that C limitation of being able to see only
 one module at a time.
To be frank, dmd's current inlining capabilities are rather disappointing. However, gdc IME has shown better track record in that area, so it looks promising.
This is about inherent language opportunities, not whether current implementations fall short or not.
 4. C strings are 0-terminated, D strings have a length property. The
 former has major negative performance consequences:

      a. lots of strlen()'s are necessary
Yeah, this is a common C bottleneck that most obsessively-optimizing C coders overlook. Many a time profiling of my "optimal" code turned up surprising bottlenecks, like fprintf's in the wrong place, or strlen's in inner loops. D's strings may be "heavier" (in the sense that they require a length field as opposed to a mere pointer -- y'know, the kind of stuff obsessively-optimizing C coders lose sleep over), but it comes with so many advantages that it's more than worth the price.
Yup.
      b. using substrings usually requires a malloc/copy/free sequence
Yeah, string manipulation in C (and to a great extent C++) is a royal pain. It actually drove me to use Perl for non-performance critical string manipulating code for quite a good number of years. The heavy syntax required for using a regex library in C, the memory management nightmare of keeping track of substrings, all contribute to fragile code, increased development times, and poorer performance. Only obsessively-optimizing C coders can fail to see this (and I used to be among them). ;-)
I failed to recognize the problem for what it was for years, too. I think the 0 terminated string issue is a severe fault in C, and it was carried over into C++'s std::string. C++ missed a big opportunity there.
 6. D's array slicing coupled with GC means that many
 malloc/copy/free's normally done in C are unnecessary in D.
It does mean we need to get our act together and improve the GC's performance, though. ;-)
The best memory allocation algorithm is one that doesn't do allocations at all. While this may sound trite, I think it is a crucial insight. The mere existence of the GC enables many allocations to not be done. The other thing is that D's ranges can also be used to eliminate a lot of low level allocations.
 But this is a major advantage in array manipulation (esp. string
 manipulation) in D. In equivalent C code, best practices dictate that
 slicing should always be done by allocating a new array and copying,
 because otherwise you introduce intricate dependencies between pointers
 and your code either becomes wrong (leaks memory / has dangling
 pointers), or extremely convoluted (reinvent the GC with reference
 counting, etc.). Being able to freely slice any array you want without
 needing to keep track of every single pointer, is a big savings in
 development time, not to mention better runtime performance because you
 don't have to keep allocating & copying. All that copying does add up!
Exactly.
 There's also another aspect to this: being built into the language, D
 slices allows different pieces of code to freely exchange them without
 needing to worry about memory management. In equivalent C code, an
 efficient implementation of slices would require the use of custom
 structs (as a pair of pointers, or a pointer + length ala D), or
 specific function parameter conventions (e.g. always pass pointer +
 length). Since everyone will implement this in a slightly different way,
 it makes interoperating different bits of code a pain. One function
 takes a pointer + length, another function takes a pointer pair, a third
 function takes a custom struct that encapsulates the foregoing, and a
 fourth function takes *another* custom struct that does the same thing
 but with a different implementation -- so now to make the code work
 together, you have to insert intermediate layers of code to convert
 between the representations. It's a big development time sink, and yet
 another corner for bugs to hide in. In D, since slices are built-in,
 everybody takes a slice, no interconversion is necessary, and everybody
 is happy.
Yup.
Dec 06 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-12-07 00:56, Walter Bright wrote:

 2. D knows when functions are pure. C has to make worst case
 assumptions.
Does the compiler currently take advantage of this?
dmd does.
Compiling the following code with DMD 2.064.2 produces the exact same assembly code for "foo" and "main" with or without "pure":

    pure int foo (immutable int a, immutable int b)
    {
        return a + b;
    }

    void main ()
    {
        auto a = foo(1, 2);
        auto b = foo(1, 2);
        auto c = a + b;
    }

I compiled with "dmd -O -release foo.d", am I doing something wrong?
 This is about inherent language opportunities, not whether current
 implementations fall short or not.
I think most people will care about what's working right now. Not what could possibly work sometime in the future.

-- /Jacob Carlborg
Dec 09 2013
next sibling parent reply Iain Buclaw <ibuclaw gdcproject.org> writes:
On 9 December 2013 08:05, Jacob Carlborg <doob me.com> wrote:
 On 2013-12-07 00:56, Walter Bright wrote:

 2. D knows when functions are pure. C has to make worst case
 assumptions.
Does the compiler currently take advantage of this?
dmd does.
Compiling the following code with DMD 2.064.2 produces the exact same assembly code for "foo" and "main" with or without "pure":

    pure int foo (immutable int a, immutable int b)
    {
        return a + b;
    }

    void main ()
    {
        auto a = foo(1, 2);
        auto b = foo(1, 2);
        auto c = a + b;
    }

I compiled with "dmd -O -release foo.d", am I doing something wrong?
Of course, it will not work unless you also pass the following options:
-version=Enterprise -noboundscheck -inline -property -transition=11629

I thought everyone knew that??!?? :o)
Dec 09 2013
parent Jacob Carlborg <doob me.com> writes:
On 2013-12-09 09:15, Iain Buclaw wrote:

 Of course, it will not work unless you also pass the following options:
 -version=Enterprise -noboundscheck -inline -property -transition=11629

 I thought everyone knew that??!?? :o)
Still no difference ;)

-- /Jacob Carlborg
Dec 09 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/9/2013 12:05 AM, Jacob Carlborg wrote:
 With DMD 2.064.2 produces the exact same assembly code for "foo" and "main" with
 or without "pure". I compiled with "dmd -O -release foo.d", am I doing
 something wrong?
Try this:

    pure int foo (int a, int b) nothrow {
      return a + b;
    }

    int test() {
      return foo(1, 2) + foo(1, 2);
    }

Compile with:

    dmd foo -O -release -c

And dump the assembly:

              push    EAX
              mov     EAX,2
              push    1
              call    near ptr _D3foo3fooFNaNbiiZi
              add     EAX,EAX
              pop     ECX
              ret

Granted, more cases can be done, but it *does* take advantage of purity here and there.
 This is about inherent language opportunities, not whether current
 implementations fall short or not.
I think most people will care about what's working right now. Not what could possibly work sometime in the future.
One cannot do better optimization if the language does not allow it.
Dec 09 2013
next sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-12-09 09:45, Walter Bright wrote:

 Try this:

    pure int foo (int a, int b) nothrow {
      return a + b;
    }

    int test() {
      return foo(1, 2) + foo(1, 2);
    }

 Compile with:

    dmd foo -O -release -c

 And dump the assembly:

              push    EAX
              mov     EAX,2
              push    1
              call    near ptr _D3foo3fooFNaNbiiZi
              add     EAX,EAX
              pop     ECX
              ret

 Granted, more cases can be done, but it *does* take advantage of purity
 here and there.
Yeah, that worked. I tried the simplest case I could think of.

-- /Jacob Carlborg
Dec 09 2013
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Dec 09, 2013 at 12:45:37AM -0800, Walter Bright wrote:
[...]
 Try this:
 
   pure int foo (int a, int b) nothrow {
     return a + b;
   }
 
   int test() {
     return foo(1, 2) + foo(1, 2);
   }
 
 Compile with:
 
   dmd foo -O -release -c
 
 And dump the assembly:
 
             push    EAX
             mov     EAX,2
             push    1
             call    near ptr _D3foo3fooFNaNbiiZi
             add     EAX,EAX
             pop     ECX
             ret
 
 Granted, more cases can be done, but it *does* take advantage of
 purity here and there.
[...]

Does this currently only happen inside a single expression? Any plans to have the optimizer recognize pure calls in separate (but identical) expressions? Jacob's example seems like it *should* be easily detected:

    int a = pureFunc(a,b,c);
    int b = pureFunc(a,b,c); // should elide this second call

T

-- The only difference between male factor and malefactor is just a little emptiness inside.
Dec 09 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/9/2013 10:50 AM, H. S. Teoh wrote:
 Does this currently only happen inside a single expression? Any plans to
 have the optimizer recognize pure calls in separate (but identical)
 expressions? Jacob's example seems like it *should* be easily detected:

 	int a = pureFunc(a,b,c);
 	int b = pureFunc(a,b,c); // should elide this second call
Certainly more can be done, as with many other optimizations, but I haven't been able to spend much time on it.
Dec 09 2013
prev sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
H. S. Teoh:

(if your tree is 1 million nodes, then it
 has to do 1 million free's, right then, right there,
In practice real C programs use arenas and pools to allocate the nodes from. This sometimes doubles the performance of C code that has to allocate many nodes of a tree data structure. A simple example:

http://rosettacode.org/wiki/Self-referential_sequence#Faster_Low-level_Version

Some of such code will become useless once Phobos has Andrei allocators :-)

In C sometimes you also use hierarchical memory allocation, to simplify the memory management

currently supported by Andrei allocators.

Bye,
bearophile
Dec 06 2013
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Dec 07, 2013 at 12:56:48AM +0100, bearophile wrote:
 H. S. Teoh:
 
(if your tree is 1 million nodes, then it has to do 1 million free's,
right then, right there,
In practice real C programs use arenas and pools to allocate the nodes from. This sometimes doubles the performance of C code that has to allocate many nodes of a tree data structure.
The problem with this in C, is that the code has to be designed to work with that particular arena/pool implementation that you're using. This makes interoperating between libraries a pain, and usually this means you can't use a lot of libraries, and you have to reinvent a lot of code just so they will work with the pool implementation.
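A minimal sketch of the arena idea under discussion, written in D for consistency with the rest of the thread. `Node` and `NodeArena` are hypothetical names, and for brevity the backing pool is itself a GC-allocated array rather than malloc'd memory. It also shows the coupling described above: code that builds trees has to call `arena.make` instead of a general allocator, so it is tied to this particular arena type.

```d
struct Node
{
    int value;
    Node* left;
    Node* right;
}

// Fixed-capacity bump allocator for tree nodes.
struct NodeArena
{
    private Node[] pool;   // backing storage, allocated once
    private size_t used;   // number of nodes handed out so far

    this(size_t capacity)
    {
        pool = new Node[](capacity);
    }

    // Hands out the next unused node, or null when the pool is exhausted.
    Node* make(int value)
    {
        if (used == pool.length)
            return null;
        Node* n = &pool[used++];
        *n = Node(value, null, null);
        return n;
    }

    // Releases every node at once; no per-node free is needed.
    void reset()
    {
        used = 0;
    }
}
```

After `reset()`, previously returned pointers must no longer be used, which is exactly the kind of convention every caller has to know about.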
 A simple example:
 
 http://rosettacode.org/wiki/Self-referential_sequence#Faster_Low-level_Version
 
 Some of such code will become useless once Phobos has Andrei
 allocators :-)
 
 In C sometimes you also use hierarchical memory allocation, to
 simplify the memory management

 currently supported by Andrei allocators.
[...]

Yes, but again, this requires the code to be written to use hierarchical memory allocation. So you can't use a library that doesn't support it (well, you can, but it will not have good performance).

There are a lot of advantages to having a standard memory allocation scheme built into the language (or at least, endorsed by the language). People don't often think about this, but a lot of overhead comes from interfacing between libraries of incompatible APIs / memory allocation schemes. Having a common scheme for everybody helps a lot, by eliminating the need for interfacing between them, or the need to reinvent the wheel because some library is incompatible with your custom memory allocator.

T

-- By understanding a machine-oriented language, the programmer will tend to use a much more efficient method; it is much closer to reality. -- D. Knuth
Dec 06 2013
prev sibling next sibling parent reply "qznc" <qznc web.de> writes:
On Friday, 6 December 2013 at 22:20:19 UTC, Walter Bright wrote:
 "there is no way proper C code can be slower than those 
 languages."
 http://www.reddit.com/r/programming/comments/1s5ze3/benchmarking_d_vs_go_vs_erlang_vs_c_for_mqtt/cduwwoy

 comes up now and then. I think it's incorrect, D has many 
 inherent advantages in generating code over C:
Good choice of words. The competent C programmer is able to perform comparable optimizations at least in the hot spots. D gives you a few of those for free (garbage collection, slices) and makes some things significantly easier (templates).

However, I think the original statement is also true in the technical sense. The same argument can be made with assembly. It is impossible to beat "proper" hand-written asm, where proper means "only theoretically possible". In practice I agree with you that optimizing a D program should be easier than optimizing a C program.
Dec 07 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/7/2013 12:13 AM, qznc wrote:
 However, I think the original statement is also true in the technical sense. The
 same argument can be made with assembly. It is impossible to beat "proper"
 hand-written asm, where proper means "only theoretically possible". In practice
 I agree with you that optimizing a D program should be easier than optimizing a
 C program.
"there is no way proper C code can be slower than those languages."

It's the qualifier "proper". You say that means theoretically possible, I disagree. I suggest that proper C code means code that is presentable, maintainable and professionally written using commonly accepted best practices.

For example, I've seen metaprogramming done in C using the preprocessor. It works, but I consider the result to be not presentable, not maintainable, and unprofessional and not a best practice.

For another, Maxim suggested that it was easy in C to use D style length-delimited strings instead of 0 terminated ones. It's certainly theoretically possible to write such a string type in C, but it ain't easy and your result will be completely out of step with anyone else's C code and C libraries, which is why people don't do it.

For another, how many times have you seen bubble sort reimplemented in C code? How about the obvious implementation of string searching? etc.? I've seen that stuff a lot. But in D, using a best-of-breed implementation of quicksort is easy as pie, same with searching, etc. These kinds of things also make D faster. I've translated C code into D before and gotten it to run faster by doing these sorts of plug-in algorithm replacements.
Dec 07 2013
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/07/2013 09:41 AM, Walter Bright wrote:

 "there is no way proper C code can be slower than those languages."
...
I think that statement is correct, but fully irrelevant. http://en.wikipedia.org/wiki/No_true_Scotsman
 It's the qualifier "proper". You say that means theoretically possible,
 I disagree. I suggest that proper C code means code that is presentable,
 maintainable and professionally written using commonly accepted best
 practices. ...
I suggest what is meant by "proper" is "faster than any implementation in those languages". :)
Dec 07 2013
parent "Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:
On Saturday, 7 December 2013 at 09:55:10 UTC, Timon Gehr wrote:
 I suggest what is meant by "proper" is "faster than any 
 implementation in those languages". :)
Exactly, proper C is anything that runs as fast as possible, screw presentation and maintainability, that isn't C.
Dec 07 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 07/12/13 09:41, Walter Bright wrote:
 For another, how many times have you seen bubble sort reimplemented in C code?
 How about the obvious implementation of string searching? etc.? I've seen that
 stuff a lot. But in D, using a best-of-breed implementation of quicksort is easy
 as pie, same with searching, etc. These kinds of things also make D faster. I've
 translated C code into D before and gotten it to run faster by doing these sorts
 of plug-in algorithm replacements.
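As a small sketch of what "plug-in algorithm replacement" can look like, the following uses `sort` and `find` from Phobos's `std.algorithm`. The wrapper names `sortedCopy` and `containsText` are hypothetical, and which sorting algorithm the library uses internally is an implementation detail of Phobos, not something this example relies on.

```d
import std.algorithm : find, sort;

// Returns a sorted copy, leaving the caller's array untouched.
int[] sortedCopy(const(int)[] xs)
{
    auto copy = xs.dup;
    sort(copy);   // library sort instead of a hand-rolled bubble sort
    return copy;
}

// Substring search via the library instead of a hand-written nested loop.
// find returns the part of the haystack starting at the first match,
// or an empty slice when there is no match.
bool containsText(string haystack, string needle)
{
    return find(haystack, needle).length != 0;
}
```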
Conversely, where it seems necessary, it's always possible to write D code in a "C-like", very detailed imperative style that really takes micro control of how something is implemented. However, that can usually be hidden away inside a function so that the end user doesn't need to be bothered by it. With C you're pretty much obliged to write complicated code in many situations. With D, even where you need to write like this it's usually _less_ complicated (less boilerplate etc.) and you only have to break it out where it's really, really necessary.
Dec 07 2013
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/7/13 12:41 AM, Walter Bright wrote:
 But in D, using a best-of-breed
 implementation of quicksort is easy as pie
ahem :o) Andrei
Dec 07 2013
prev sibling next sibling parent reply "Araq" <rumpf_a web.de> writes:
On Friday, 6 December 2013 at 22:20:19 UTC, Walter Bright wrote:
 "there is no way proper C code can be slower than those 
 languages."

   -- 
 http://www.reddit.com/r/programming/comments/1s5ze3/benchmarking_d_vs_go_vs_erlang_vs_c_for_mqtt/cduwwoy

 comes up now and then. I think it's incorrect, D has many 
 inherent advantages in generating code over C:

 1. D knows when data is immutable. C has to always make worst 
 case assumptions, and assume indirectly accessed data mutates.

 2. D knows when functions are pure. C has to make worst case 
 assumptions.

 3. Function inlining has generally been shown to be of 
 tremendous value in optimization. D has access to all the 
 source code in the program, or at least as much as you're 
 willing to show it, and can inline across modules. C cannot 
 inline functions unless they appear in the same module or in .h 
 files. It's a rare practice to push many functions into .h 
 files. Of course, there are now linkers that can do whole 
 program optimization for C, but those are kind of herculean 
 efforts to work around that C limitation of being able to see 
 only one module at a time.

 4. C strings are 0-terminated, D strings have a length 
 property. The former has major negative performance 
 consequences:

     a. lots of strlen()'s are necessary

     b. using substrings usually requires a malloc/copy/free 
 sequence

 5. CTFE can push a lot of computation to compile time rather 
 than run time. This has had spectacular positive performance 
 consequences for things like regex. C has no CTFE ability.

 6. D's array slicing coupled with GC means that many 
 malloc/copy/free's normally done in C are unnecessary in D.

 7. D's "final switch" enables more efficient switch code 
 generation, because the default doesn't have to be considered.
From this list only (7) is a valid point. All the others can be trivially dealt with by whole program optimization (1,2,3) or coding conventions (4,5,6) (e.g. always pass a (char*, len) pair around for efficient slicing). Interestingly, things that are encouraged in Ada (this is an array of integers of range 0..30, see value range propagation) are much harder to recompute with whole program optimization, and D lacks them.
Dec 08 2013
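For reference, a D slice is itself a (pointer, length) pair, so the convention mentioned above is what the language does by default. A minimal sketch follows; the helper name `firstWord` is made up for illustration and is not from the thread.

```d
import std.stdio : writeln;

// Returns the first space-delimited word of `s` as a slice.
// No allocation or copy: the result points into the memory of `s`.
string firstWord(string s) pure nothrow @safe
{
    foreach (i, c; s)
    {
        if (c == ' ')
            return s[0 .. i];
    }
    return s;
}

void main()
{
    string line = "final switch";
    string word = firstWord(line);

    assert(word == "final");
    assert(word.length == 5);      // the length is stored, no strlen() scan
    assert(word.ptr is line.ptr);  // same memory, nothing was copied

    string rest = line[word.length + 1 .. $];
    assert(rest == "switch");
    assert(rest.ptr is line.ptr + 6);

    writeln(word, " | ", rest);
}
```

The substrings stay valid without a malloc/copy/free sequence because the GC keeps the underlying buffer alive for as long as any slice refers to it.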
next sibling parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Sunday, 8 December 2013 at 10:13:58 UTC, Araq wrote:
 On Friday, 6 December 2013 at 22:20:19 UTC, Walter Bright wrote:
 "there is no way proper C code can be slower than those 
 languages."

  -- 
 http://www.reddit.com/r/programming/comments/1s5ze3/benchmarking_d_vs_go_vs_erlang_vs_c_for_mqtt/cduwwoy

 comes up now and then. I think it's incorrect, D has many 
 inherent advantages in generating code over C:

 1. D knows when data is immutable. C has to always make worst 
 case assumptions, and assume indirectly accessed data mutates.

 2. D knows when functions are pure. C has to make worst case 
 assumptions.

 3. Function inlining has generally been shown to be of 
 tremendous value in optimization. D has access to all the 
 source code in the program, or at least as much as you're 
 willing to show it, and can inline across modules. C cannot 
 inline functions unless they appear in the same module or in 
 .h files. It's a rare practice to push many functions into .h 
 files. Of course, there are now linkers that can do whole 
 program optimization for C, but those are kind of herculean 
 efforts to work around that C limitation of being able to see 
 only one module at a time.

 4. C strings are 0-terminated, D strings have a length 
 property. The former has major negative performance 
 consequences:

    a. lots of strlen()'s are necessary

    b. using substrings usually requires a malloc/copy/free 
 sequence

 5. CTFE can push a lot of computation to compile time rather 
 than run time. This has had spectacular positive performance 
 consequences for things like regex. C has no CTFE ability.

 6. D's array slicing coupled with GC means that many 
 malloc/copy/free's normally done in C are unnecessary in D.

 7. D's "final switch" enables more efficient switch code 
 generation, because the default doesn't have to be considered.
 coding conventions (4,5,6) (always pass a (char*, len) pair 
 around for efficient slicing).
How does a coding convention allow you to create a high-performance regex engine at compile time? How does it allow you to do pretty much any of what CTFE can do?
 Interestingly, things that are encouraged in Ada (this is an 
 array of integers of range 0..30, see value range propagation) 
 are much harder to recompute with whole program optimization 
 and D lacks them.
Agreed.
Dec 08 2013
parent "Araq" <rumpf_a web.de> writes:
 How does a coding convention allow you to create a 
 high-performance regex engine at compile time? How does it 
 allow you to do pretty much any of what CTFE can do?
Well you can always pretend the output of re2c or flex was your hand-written C code. ;-) But sure, "code conventions" is not the proper term for what I mean here.
Dec 08 2013
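To make the CTFE point concrete, here is a small sketch of an ordinary D function whose result is computed by the compiler and stored in the binary. It builds the standard CRC-32 lookup table; the names `makeCrcTable`, `crcTable` and `crc32` are illustrative only. The C counterpart would typically be a separate generator step, much like the re2c/flex output mentioned above.

```d
// An ordinary function; nothing marks it as "compile-time only".
uint[256] makeCrcTable() pure nothrow @safe
{
    uint[256] table;
    foreach (uint i; 0 .. 256)
    {
        uint c = i;
        foreach (_; 0 .. 8)
            c = (c & 1) ? (0xEDB88320 ^ (c >> 1)) : (c >> 1);
        table[i] = c;
    }
    return table;
}

// Initialising a static immutable forces evaluation at compile time (CTFE).
static immutable uint[256] crcTable = makeCrcTable();

// These checks are performed by the compiler, not at run time.
static assert(crcTable[0] == 0);
static assert(crcTable[1] == 0x77073096);

// Run-time part: only table lookups are left.
uint crc32(const(ubyte)[] data) pure nothrow @safe
{
    uint crc = 0xFFFFFFFF;
    foreach (b; data)
        crc = crcTable[(crc ^ b) & 0xFF] ^ (crc >> 8);
    return crc ^ 0xFFFFFFFF;
}

void main()
{
    // Standard CRC-32 check value for the ASCII string "123456789".
    assert(crc32(cast(const(ubyte)[]) "123456789") == 0xCBF43926);
}
```

The same function can still be called at run time; whether it runs in the compiler depends only on the context in which it is used.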
prev sibling next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Araq:

 Interestingly, things that are encouraged in Ada (this is an 
 array of integers of range 0..30, see value range propagation) 
 are much harder to recompute with whole program optimization 
 and D lacks them.
I am currently thinking about related topics. What do you mean? I don't understand. Bye, bearophile
Dec 08 2013
parent reply "Araq" <rumpf_a web.de> writes:
On Sunday, 8 December 2013 at 11:17:15 UTC, bearophile wrote:
 Araq:

 Interestingly, things that are encouraged in Ada (this is an 
 array of integers of range 0..30, see value range propagation) 
 are much harder to recompute with whole program optimization 
 and D lacks them.
I am currently thinking about related topics. What do you mean? I don't understand. Bye, bearophile
Well that "int[]" is in fact really an "array of int of range 0..30" is a property of a type; pureness or what a function modifies is a property of a function. Properties of types are inherently more difficult to infer than properties of functions.
Dec 08 2013
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 12/08/2013 03:13 PM, Araq wrote:
 On Sunday, 8 December 2013 at 11:17:15 UTC, bearophile wrote:
 Araq:

 Interestingly, things that are encouraged in Ada (this is an array of
 integers of range 0..30, see value range propagation) are much harder
 to recompute with whole program optimization and D lacks them.
I am currently thinking about related topics. What do you mean? I don't understand. Bye, bearophile
Well that "int[]" is in fact really an "array of int of range 0..30" is a property of a type; pureness or what a function modifies is a property of a function. Properties of types are inherently more difficult to infer than properties of functions.
I don't understand this point. Functions have types as well. Are you referring to mutability?
Dec 08 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/8/2013 2:13 AM, Araq wrote:
 From this list only (7) is a valid point. All the others can be trivially dealt
 with whole program optimization (1,2,3)
If it's trivial, it's not happening. (1) would require solving the halting problem. (2) is impractical because there's no way for the programmer to detect if his call stack is pure or not, so he can't reasonably fix it to make it pure. (3) does require whole program analysis, which is done (as I said) by the linker, and is not normally done. Will gcc/ld do it? Nope. (It's not so trivial.)
 or coding conventions (4,5,6) (always
 pass a (char*, len) pair around for efficient slicing).
(4) just try retrofitting an existing C program to do this. I have. You'll give it up soon :-) (5) it just doesn't happen in C code - it's too hard (6) I pretty much never see this happening in real C code - it's not so trivial
 Interestingly, things
 that are encouraged in Ada (this is an array of integers of range 0..30, see
 value range propagation) are much harder to recompute with whole program
 optimization and D lacks them.
You can do these in D with a library type.
Dec 08 2013
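One way such a library type could look is sketched below. `Ranged` and `Small` are hypothetical names, not an existing Phobos type, and the sketch only enforces the range at run time; it does not by itself hand value-range information to the optimizer the way an Ada range type can.

```d
import std.exception : enforce;

// An int that is only allowed to hold values in lo .. hi (inclusive).
struct Ranged(int lo, int hi)
{
    static assert(lo <= hi, "lower bound must not exceed upper bound");

    private int payload = lo;

    this(int v) { opAssign(v); }

    // Every store goes through this check, so lo <= payload <= hi
    // holds for every value of the type.
    void opAssign(int v)
    {
        enforce(lo <= v && v <= hi, "value out of range");
        payload = v;
    }

    // Read access as a plain int.
    int get() const { return payload; }
    alias get this;
}

alias Small = Ranged!(0, 30);

void main()
{
    // "An array of integers of range 0..30".
    Small[] xs = [Small(3), Small(30), Small(0)];

    int total = 0;
    foreach (x; xs)
        total += x;          // implicit conversion through alias this
    assert(total == 33);

    bool rejected = false;
    try
    {
        xs[0] = 31;          // out of range: enforce throws
    }
    catch (Exception)
    {
        rejected = true;
    }
    assert(rejected);
    assert(xs[0] == 3);      // the failed store left the value untouched
}
```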
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/08/2013 07:53 PM, Walter Bright wrote:
 On 12/8/2013 2:13 AM, Araq wrote:
 From this list only (7) is a valid point. All the others can be
 trivially dealt
 with whole program optimization (1,2,3)
If it's trivial, it's not happening. (1) would require solving the halting problem. ...
No it would not. If all you need to do is catch up with D, just infer where immutability qualifiers should go. This is decidable (the search space is finite). Of course, an analysis of this style could infer more fine-grained global aliasing information anyway.
Dec 08 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/8/2013 12:06 PM, Timon Gehr wrote:
 On 12/08/2013 07:53 PM, Walter Bright wrote:
 On 12/8/2013 2:13 AM, Araq wrote:
 From this list only (7) is a valid point. All the others can be
 trivially dealt
 with whole program optimization (1,2,3)
If it's trivial, it's not happening. (1) would require solving the halting problem. ...
No it would not. If all you need to do is catching up with D, just infer where immutability qualifiers should go. This is decidable (the search space is finite). Of course, an analysis of this style could infer more fine-grained global aliasing information anyway.
I don't believe it is remotely correct that any amount of static analysis will determine where a pointer points to, just as it won't tell you what value the variable 'i' has in it.
Dec 08 2013
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/08/2013 09:42 PM, Walter Bright wrote:
 On 12/8/2013 12:06 PM, Timon Gehr wrote:
 On 12/08/2013 07:53 PM, Walter Bright wrote:
 On 12/8/2013 2:13 AM, Araq wrote:
 From this list only (7) is a valid point. All the others can be
 trivially dealt
 with whole program optimization (1,2,3)
If it's trivial, it's not happening. (1) would require solving the halting problem. ...
No it would not. If all you need to do is catching up with D, just infer where immutability qualifiers should go. This is decidable (the search space is finite). Of course, an analysis of this style could infer more fine-grained global aliasing information anyway.
I don't believe it is remotely correct that any amount of static analysis will determine where a pointer points to, just as it won't tell you what value the variable 'i' has in it.
? Static analysis will tell you where a pointer might point to (more importantly, it may exclude aliasing) and what values the variable 'i' might have in it. How precise this information is hinges on the details of the analysis and the program it is applied on. I'm just saying that this may be more precise than what 'immutable' qualifiers give you (since those are quite easy to infer, given the program, and alias analysis may be non-trivial.)
Dec 08 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/8/2013 4:47 PM, Timon Gehr wrote:
 Static analysis will tell you where a pointer might point to (more importantly,
 it may exclude aliasing) and what values the variable 'i' might have in it. How
 precise this information is hinges on the details of the analysis and the
 program it is applied on. I'm just saying that this may be more precise than
 what 'immutable' qualifiers give you (since those are quite easy to infer, given
 the program, and alias analysis may be non-trivial.)
There has been a lot of work trying to eliminate array bounds checking by figuring out the limits of i. This has met with only limited success. I know of no C compiler that even attempts such things, and it certainly would be done if it was trivial. "Proper" D code makes routine use of const and immutable, and so that information is effortlessly available to even simple optimizers.
Dec 08 2013
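A short sketch of what "routine use of const and immutable" gives the compiler without any whole-program analysis. The function names `sum` and `twice` are invented for illustration, and the sketch only shows what the type system guarantees; whether a particular compiler exploits it is not claimed here.

```d
// `pure` plus an immutable parameter: the result depends only on `xs`,
// and nothing reachable through `xs` can ever change.
int sum(immutable(int)[] xs) pure nothrow @safe
{
    int total = 0;
    foreach (x; xs)
        total += x;
    return total;
}

int twice(immutable(int)[] xs, int[] scratch)
{
    int a = sum(xs);
    scratch[0] = 99;   // a store through int[] cannot legally alias immutable data
    int b = sum(xs);   // so an optimizer is allowed to reuse `a` here
    return a + b;
}

void main()
{
    immutable(int)[] data = [1, 2, 3];
    int[] scratch = new int[](1);

    assert(twice(data, scratch) == 12);
    assert(scratch[0] == 99);

    // The guarantee is enforced by the type system: a mutable array
    // does not implicitly convert to immutable.
    int[] m = [1, 2, 3];
    static assert(!__traits(compiles, sum(m)));
    assert(sum(m.idup) == 6);  // an immutable copy is fine
}
```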
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/09/2013 02:05 AM, Walter Bright wrote:
 On 12/8/2013 4:47 PM, Timon Gehr wrote:
 Static analysis will tell you where a pointer might point to (more
 importantly,
 it may exclude aliasing) and what values the variable 'i' might have
 in it. How
 precise this information is hinges on the details of the analysis and the
 program it is applied on. I'm just saying that this may be more
 precise than
 what 'immutable' qualifiers give you (since those are quite easy to
 infer, given
 the program, and alias analysis may be non-trivial.)
There has been a lot of work trying to eliminate array bounds checking by figuring out the limits of i. This has met with only limited success. I know of no C compiler that even attempts such things, and it certainly would be done if it was trivial. "Proper" D code makes routine use of const and immutable, and so that information is effortlessly available to even simple optimizers.
Sorry, I'm lost. What point are you arguing? None of this disputes in any way anything I wrote.
Dec 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/9/2013 1:28 PM, Timon Gehr wrote:
 Sorry, I'm lost. What point are you arguing? None of this disputes in any way
 anything I wrote.
I thought you were arguing that whole program analysis was as good as using immutable and const qualifiers.
Dec 09 2013
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/09/2013 11:00 PM, Walter Bright wrote:
 On 12/9/2013 1:28 PM, Timon Gehr wrote:
 Sorry, I'm lost. What point are you arguing? None of this disputes in
 any way
 anything I wrote.
I thought you were arguing that whole program analysis was as good as using immutable and const qualifiers.
Indeed. Recovering all immutable and const qualifiers that are possible to assign to the program is simple to do if the whole program is available. (I was neither arguing that static analysis will eliminate all array bounds checks, nor that integrating such information into an existing C back-end is trivial, nor that manually adding the information does not mean that it does not have to be inferred.)
Dec 10 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/10/2013 1:58 AM, Timon Gehr wrote:
 Recovering all immutable and const qualifiers that are possible to
 assign to the program is simple to do if the whole program is available.
I believe you are seriously mistaken about it being simple, or even possible. For example, malloc() returns a pointer. How do you know if that returns a pointer to immutable data or not? Even if it could look at the source code to malloc()?
Dec 10 2013
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/10/2013 07:26 PM, Walter Bright wrote:
 On 12/10/2013 1:58 AM, Timon Gehr wrote:
 Recovering all immutable and const qualifiers that are possible to
 assign to the program is simple to do if the whole program is available.
I believe you are seriously mistaken about it being simple, or even possible. ...
Unfortunately I'm not currently in a position to invest any time into implementing mutability analysis for C code. Maybe later. It might prove interesting to see how well a straightforward inference of D-style qualifiers works for existing code bases. (But this is not the relevant question here since we are talking about _inherent_ advantages.)
 For example, malloc() returns a pointer. How do you know if that returns
 a pointer to immutable data or not? Even if it could look at the source
 code to malloc()?
Malloc is part of the language runtime. Everything needed is known about it, in particular that it is pure (in the D sense). Also, the source code of malloc will not be standard C code. In any case, please understand that giving even a valid example of a case where such an approach cannot do well (e.g. all variables under consideration are actually mutated) does not invalidate the statement made.
Dec 10 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/10/2013 3:04 PM, Timon Gehr wrote:
 Malloc is part of the language runtime. Everything needed is known about it, in
 particular that it is pure (in the D sense). Also, the source code of malloc
 will not be standard C code.
All right, so write your own storage allocator. How are you going to tell the C compiler that it's pure?
 In any case, please understand that giving even a valid example of a case where
 such an approach cannot do well (e.g. all variables under consideration are
 actually mutated) does not invalidate the statement made.
Pointers coming from things like allocators and data structures, where no known compiler technology could deduce what is happening inside them, are the norm, not the exception.
Dec 10 2013
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/11/2013 12:46 AM, Walter Bright wrote:
 On 12/10/2013 3:04 PM, Timon Gehr wrote:
 Malloc is part of the language runtime. Everything needed is known
 about it, in
 particular that it is pure (in the D sense). Also, the source code of
 malloc
 will not be standard C code.
All right, so write your own storage allocator. How are you going to tell the C compiler that it's pure?
How about the D compiler?
Dec 14 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/14/2013 9:19 AM, Timon Gehr wrote:
 On 12/11/2013 12:46 AM, Walter Bright wrote:
 On 12/10/2013 3:04 PM, Timon Gehr wrote:
 Malloc is part of the language runtime. Everything needed is known
 about it, in
 particular that it is pure (in the D sense). Also, the source code of
 malloc
 will not be standard C code.
All right, so write your own storage allocator. How are you going to tell the C compiler that it's pure?
How about the D compiler?
You can cast the result to const/immutable, which is why the casting is possible. It tells the compiler things that cannot be deduced.
Dec 14 2013
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/14/2013 08:30 PM, Walter Bright wrote:
 On 12/14/2013 9:19 AM, Timon Gehr wrote:
 On 12/11/2013 12:46 AM, Walter Bright wrote:
 On 12/10/2013 3:04 PM, Timon Gehr wrote:
 Malloc is part of the language runtime. Everything needed is known
 about it, in
 particular that it is pure (in the D sense). Also, the source code of
 malloc
 will not be standard C code.
All right, so write your own storage allocator. How are you going to tell the C compiler that it's pure?
How about the D compiler?
You can cast the result to const/immutable, which is why the casting is possible. It tells the compiler things that cannot be deduced.
I cannot cast data from my own storage allocator to immutable because the behaviour will be undefined. http://dlang.org/const3.html Is this a documentation bug? What should be the actual rules?
Dec 14 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/14/2013 4:36 PM, Timon Gehr wrote:
 I cannot cast data from my own storage allocator to immutable because the
 behaviour will be undefined.

 http://dlang.org/const3.html

 Is this a documentation bug? What should be the actual rules?
It means it's up to you to ensure it is correct.
Dec 14 2013
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/15/2013 02:20 AM, Walter Bright wrote:
 On 12/14/2013 4:36 PM, Timon Gehr wrote:
 I cannot cast data from my own storage allocator to immutable because the
 behaviour will be undefined.

 http://dlang.org/const3.html

 Is this a documentation bug? What should be the actual rules?
It means it's up to you to ensure it is correct.
Undefined behaviour is a term with a precise meaning. That site says that casting a reference to immutable while there are still live mutable references to memory reachable by that reference leads to undefined behaviour. It is not possible to 'ensure it is correct'.
Dec 15 2013
next sibling parent reply "Uplink_Coder" <someemail someprovider.some> writes:
On Sunday, 15 December 2013 at 11:52:17 UTC, Timon Gehr wrote:
 On 12/15/2013 02:20 AM, Walter Bright wrote:
 On 12/14/2013 4:36 PM, Timon Gehr wrote:
 I cannot cast data from my own storage allocator to immutable 
 because the
 behaviour will be undefined.

 http://dlang.org/const3.html

 Is this a documentation bug? What should be the actual rules?
It means it's up to you to ensure it is correct.
Undefined behaviour is a term with a precise meaning. That site says that casting a reference to immutable while there are still live mutable references to memory reachable by that reference leads to undefined behaviour. It is not possible to 'ensure it is correct'.
Being picky with words is not the right way to argue :D Anyhow, since we are programmers we can change meaning perfectly well: <code> alias good bad; </code> there :D Undefined behaviour may have a precise meaning to an academic, but for me as a programmer it means: AVOID THIS SITUATION!!! Unless you know what you are doing! Undefined behaviour for a compiler is a point where certain guarantees MAY be broken. Casting something says: "Compiler, my friend: trust me, I know what I'm doing", and since neither the compiler nor the compiler writer can know whether you are REALLY trustworthy, it can't and doesn't define behaviour for that case. In this case you have to picture the language as obeying the open-closed principle. The advice Walter gave was adding to the available information, not substituting it.
Dec 15 2013
next sibling parent "jerro" <a a.com> writes:
 Undefined behaviour may have a precise meaning to an academic,
 but for me as a programmer it means. AVOID THIS SITUATION !!!
 unless you know what you do!
And to the compiler writer it means: In this situation, you can do whatever the hell you want! Later, when you think you know what you are doing, it can turn out that what some specific compiler does in some specific situation is not even remotely what you were expecting it to do. For example, see gcc and its insane behavior when -fstrict-aliasing (turned on by default at -O2) is used.
Dec 15 2013
prev sibling next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Sunday, 15 December 2013 at 22:53:19 UTC, Uplink_Coder wrote:
 On Sunday, 15 December 2013 at 11:52:17 UTC, Timon Gehr wrote:
 On 12/15/2013 02:20 AM, Walter Bright wrote:
 On 12/14/2013 4:36 PM, Timon Gehr wrote:
 I cannot cast data from my own storage allocator to 
 immutable because the
 behaviour will be undefined.

 http://dlang.org/const3.html

 Is this a documentation bug? What should be the actual rules?
It means it's up to you to ensure it is correct.
Undefined behaviour is a term with a precise meaning. That site says that casting a reference to immutable while there are still live mutable references to memory reachable by that reference leads to undefined behaviour. It is not possible to 'ensure it is correct'.
Being picky with words is not the right way to argue :D Anyhow, since we are programmers we can change meaning perfectly well: <code> alias good bad; </code> there :D Undefined behaviour may have a precise meaning to an academic, but for me as a programmer it means: AVOID THIS SITUATION!!! Unless you know what you are doing! Undefined behaviour for a compiler is a point where certain guarantees MAY be broken. Casting something says: "Compiler, my friend: trust me, I know what I'm doing", and since neither the compiler nor the compiler writer can know whether you are REALLY trustworthy, it can't and doesn't define behaviour for that case. In this case you have to picture the language as obeying the open-closed principle. The advice Walter gave was adding to the available information, not substituting it.
Strict, well-defined definitions are important for communication on technical subjects. Undefined behaviour means "goodbye logic". Excepting the extreme case of hacking something for a specific compiler version, you should never use it. By definition, you cannot know what you're doing. Having mutable references to immutable data is not undefined behaviour. The normal logic of the program is guaranteed to not be broken by them. The line lies at modifying the data, not at the existence of the reference.
Dec 16 2013
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 12/15/2013 11:53 PM, Uplink_Coder wrote:
 On Sunday, 15 December 2013 at 11:52:17 UTC, Timon Gehr wrote:
 On 12/15/2013 02:20 AM, Walter Bright wrote:
 On 12/14/2013 4:36 PM, Timon Gehr wrote:
 I cannot cast data from my own storage allocator to immutable
 because the
 behaviour will be undefined.

 http://dlang.org/const3.html

 Is this a documentation bug? What should be the actual rules?
It means it's up to you to ensure it is correct.
Undefined behaviour is a term with a precise meaning. That site says that casting a reference to immutable while there are still live mutable references to memory reachable by that reference leads to undefined behaviour. It is not possible to 'ensure it is correct'.
being picky with words is not the right way to argue :D
Of course it is. http://en.wikipedia.org/wiki/Mathematical_logic Logic is the right way to reason whenever it is applicable. An example of when it is not applicable is during a debate on whether the last statement was valid. However, reasonable people will agree. Here the goal is to describe programming language semantics in a precise way, so logic is clearly applicable. Knowing what one is actually saying (relative to some logic) is necessary in order to do this in any meaningful fashion.
 anyhow.
 since we are programmers we can change meaning perfectly well
 <code>
 alias good bad;
 </code> there :D
I see neither your point nor how your example illustrates it.
 Undefined behaviour may have a precise meaning to a academic, but for me
 as a programmer it means. AVOID THIS SITUATION !!!
This is close enough to the "precise academic meaning" to leave everything I have stated meaningful.
 unless you know what you do!
 Undefined behaviour for a compiler is a point where certain guarantees MAY
 be broken. Casting something says: "Compiler, my friend: trust me, I
 know what I'm doing", and since neither the compiler nor the compiler writer
 can know whether you are REALLY trustworthy, it can't and doesn't define
 behaviour for that case.
This part is nonsense. (Also, we are not arguing about what some specific compiler is doing. A specific compiler version will usually define the semantics of a portion of code for each back-end architecture in a more fine-grained way than mandated by the language standard, specified in its documentation and/or unspecified.) The language defines behaviour conditional on "trustworthiness of the programmer". E.g. dereferencing a pointer in C has defined behaviour as long as the pointer is valid, according to a precise definition of valid. It is not the case that dereferencing a pointer in C never has defined behaviour just because one cannot verify validity automatically in general. This does not have to be possible in order to have a meaningful language specification.
 In this case you have to picture the language as obeying the open-closed
 principle.
 The advice Walter gave was adding to the available information, not
 substituting it.
Hence I was asking whether the spec is in need of an update.
Dec 16 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/15/2013 3:52 AM, Timon Gehr wrote:
 On 12/15/2013 02:20 AM, Walter Bright wrote:
 On 12/14/2013 4:36 PM, Timon Gehr wrote:
 I cannot cast data from my own storage allocator to immutable because the
 behaviour will be undefined.

 http://dlang.org/const3.html

 Is this a documentation bug? What should be the actual rules?
It means it's up to you to ensure it is correct.
Undefined behaviour is a term with a precise meaning. That site says that casting a reference to immutable while there are still live mutable references to memory reachable by that reference leads to undefined behaviour. It is not possible to 'ensure it is correct'.
You, as the guy who wrote the code, will (or should) know that there are no other live references, hence you are telling the compiler "trust me, I know there aren't any".
Dec 15 2013
parent reply "jerro" <a a.com> writes:
 You, as the guy who wrote the code, will (or should) know that 
 there are no other live references, hence you are telling the 
 compiler "trust me, I know there aren't any".
So, is the intended meaning the following: If there exist any immutable references to the data, mutating it results in undefined behavior? If so, I think the documentation on creating immutable data should explicitly say so.
Dec 15 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/15/2013 4:22 PM, jerro wrote:
 You, as the guy who wrote the code, will (or should) know that there are no
 other live references, hence you are telling the compiler "trust me, I know
 there aren't any".
So, is the intended meaning the following: If there exist any immutable references to the data, mutating it results in undefined behavior? If so, I think the documentation on creating immutable data should explicitly say so.
Good idea. I suggest writing a pull request against the documentation for this.
Dec 15 2013
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 12/16/2013 01:53 AM, Walter Bright wrote:
 On 12/15/2013 4:22 PM, jerro wrote:
 You, as the guy who wrote the code, will (or should) know that there
 are no
 other live references, hence you are telling the compiler "trust me,
 I know
 there aren't any".
So, is the intended meaning the following: If there exist any immutable references to the data, mutating it results in undefined behavior? If so, I think the documentation on creating immutable data should explicitly say so.
Good idea. I suggest writing a pull request against the documentation for this.
We need to be careful to do this right. Consider what happens when we want to dispose of memory using our custom allocator. The caller clearly needs to own an immutable reference in order to cast it to mutable and pass it to the storage allocator, where it will be freed in some way that will sometimes involve updating some metadata accessible through it. The immutable reference will not be gone during this process. [1] Furthermore, the existence of references may be nondeterministic due to the garbage collector, and the language does not make any guarantees that anything is ever collected at all. Informally speaking, the final specification should probably allow in some way immutable references to mutated data to _exist_, but not define _access_ to them. [1] The language does not allow ownership to be tracked.
Dec 16 2013
prev sibling next sibling parent Marco Leise <Marco.Leise gmx.de> writes:
Am Sun, 15 Dec 2013 16:53:21 -0800
schrieb Walter Bright <newshound2 digitalmars.com>:

 On 12/15/2013 4:22 PM, jerro wrote:
 You, as the guy who wrote the code, will (or should) know that there are no
 other live references, hence you are telling the compiler "trust me, I know
 there aren't any".
So, is the intended meaning the following: If there exist any immutable references to the data, mutating it results in undefined behavior? If so, I think the documentation on creating immutable data should explicitly say so.
Good idea. I suggest writing a pull request against the documentation for this.
+1. I kept telling people that casting immutable to mutable and vice-versa is undefined behavior, based on that page. -- Marco
Dec 16 2013
prev sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Monday, 16 December 2013 at 00:53:21 UTC, Walter Bright wrote:
 On 12/15/2013 4:22 PM, jerro wrote:
 You, as the guy who wrote the code, will (or should) know 
 that there are no
 other live references, hence you are telling the compiler 
 "trust me, I know
 there aren't any".
So, is the intended meaning the following: If there exist any immutable references to the data, mutating it results in undefined behavior? If so, I think the documentation on creating immutable data should explicitly say so.
Good idea. I suggest writing a pull request against the documentation for this.
That is a bad idea, as it precludes any GC optimization based on immutability.
Dec 16 2013
parent reply "Francesco Cattoglio" <francesco.cattoglio gmail.com> writes:
On Monday, 16 December 2013 at 17:32:11 UTC, deadalnix wrote:
 On Monday, 16 December 2013 at 00:53:21 UTC, Walter Bright
 Good idea. I suggest writing a pull request against the 
 documentation for this.
That is a bad idea as it preclude any GC optimization based on immutability.
What do you mean by this? How can a change in the documentation preclude GC optimization? Does our GC read docs to understand what can be collected? Now I get it: it's a stop the world, then RTFM, then collect GC implementation :)
Dec 16 2013
next sibling parent Marco Leise <Marco.Leise gmx.de> writes:
Am Mon, 16 Dec 2013 18:39:13 +0100
schrieb "Francesco Cattoglio" <francesco.cattoglio gmail.com>:

 On Monday, 16 December 2013 at 17:32:11 UTC, deadalnix wrote:
 On Monday, 16 December 2013 at 00:53:21 UTC, Walter Bright
 Good idea. I suggest writing a pull request against the 
 documentation for this.
That is a bad idea as it preclude any GC optimization based on immutability.
What do you mean by this? How can a change in the documentation preclude GC optimization? Does our GC read docs to understand what can be collected? Now I get it: it's a stop the world, then RTFM, then collect GC implementation :)
Think again(, please)! The more restricted the language becomes, the more room for specific optimizations there will be. When the written specs are liberal, some GC optimizations, like compaction, cannot be applied. -- Marco
Dec 16 2013
prev sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Monday, 16 December 2013 at 17:39:14 UTC, Francesco Cattoglio 
wrote:
 On Monday, 16 December 2013 at 17:32:11 UTC, deadalnix wrote:
 On Monday, 16 December 2013 at 00:53:21 UTC, Walter Bright
 Good idea. I suggest writing a pull request against the 
 documentation for this.
That is a bad idea, as it precludes any GC optimization based on immutability.
What do you mean by this? How can a change in the documentation preclude GC optimization? Does our GC read docs to understand what can be collected? Now I get it: it's a stop the world, then RTFM, then collect GC implementation :)
The GC won't know about you bypassing the type system. It never will. If that suddenly isn't undefined behavior anymore, it becomes impossible for the GC to take advantage of type qualifiers.
Dec 16 2013
prev sibling parent reply "Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:
On Sunday, 15 December 2013 at 00:36:31 UTC, Timon Gehr wrote:
 I cannot cast data from my own storage allocator to immutable 
 because the behaviour will be undefined.

 http://dlang.org/const3.html

 Is this a documentation bug? What should be the actual rules?
Casting to immutable is defined; it is modifying the data which is not. That is to say, by casting to immutable the compiler cannot guarantee no mutation will occur. I assume your confusion comes from:

    char[] s = ...;
    immutable(char)[] p = cast(immutable)s; // undefined behavior
    immutable(char)[] p = cast(immutable)s.dup; // ok, unique reference

The docs should probably be cleaned up, but it isn't exactly incorrect: since a mutable reference exists, the compiler can't really define what behavior will occur if the program is run, because it can't guarantee what behavior exists.
Dec 14 2013
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 12/15/2013 02:20 AM, Jesse Phillips wrote:
 On Sunday, 15 December 2013 at 00:36:31 UTC, Timon Gehr wrote:
 I cannot cast data from my own storage allocator to immutable because
 the behaviour will be undefined.

 http://dlang.org/const3.html

 Is this a documentation bug? What should be the actual rules?
Casting to immutable is defined; it is modifying the data which is not. That is to say, by casting to immutable the compiler cannot guarantee no mutation will occur. I assume your confusion comes from:

    char[] s = ...;
    immutable(char)[] p = cast(immutable)s; // undefined behavior
    immutable(char)[] p = cast(immutable)s.dup; // ok, unique reference

The docs should probably be cleaned up, but it isn't exactly incorrect: since a mutable reference exists, the compiler can't really define what behavior will occur if the program is run, because it can't guarantee what behavior exists.
http://en.wikipedia.org/wiki/Humpty_Dumpty#In_Through_the_Looking-Glass
Dec 15 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 You can do these in D with a library type.
We can discuss topics like this much better in a separate, dedicated thread. Bye, bearophile
Dec 08 2013
prev sibling parent reply "Araq" <rumpf_a web.de> writes:
On Sunday, 8 December 2013 at 18:53:07 UTC, Walter Bright wrote:
 On 12/8/2013 2:13 AM, Araq wrote:
 From this list only (7) is a valid point. All the others can 
 be trivially dealt
 with whole program optimization (1,2,3)
If it's trivial, it's not happening. (1) would require solving the halting problem.
Well, OK, when I say "easy" or "trivial" it doesn't mean anything ;-). (1) doesn't require solving the halting problem at all, though. For optimizations, all you need to do is come up with an analysis that conservatively approximates the runtime behaviour. ("When in doubt, assume it modifies this location.")
 (2) is impractical because there's no way for the programmer to 
 detect if his call stack is pure or not, so he can't reasonably 
 fix it to make it pure.
Well look at the subject: "inherent" vs "practical".
 (3) does require whole program analysis, which is done (as I 
 said) by the linker, and is not normally done. Will gcc/ld do 
 it? Nope. (It's not so trivial.)
Both GCC and LLVM can perform link time optimizations.
 or coding conventions (4,5,6) (always
 pass a (char*, len) pair around for efficient slicing).
(4) just try retrofitting an existing C program to do this. I have. You'll give it up soon :-)
Again -- the subject is "inherent code performance advantages". The average C program full of "if (errcode) goto errorHandler" is irrelevant to this discussion. BTW I would indeed add "D has exceptions" to your list.
 (5) it just doesn't happen in C code - it's too hard

 (6) I pretty much never see this happening in real C code - 
 it's not so trivial


 Interestingly, things
 that are encouraged in Ada (this is an array of integers of 
 range 0..30, see
 value range propagation) are much harder to recompute with 
 whole program
 optimization and D lacks them.
You can do these in D with a library type.
And get the same (theoretical) advantages for optimizations? Perhaps.
Dec 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/9/2013 6:24 AM, Araq wrote:
 ("When in doubt, assume it modifies this location.")
And it's usually in doubt, often enough to make that optimization a pipe dream.
 (2) is impractical because there's no way for the programmer to detect if his
 call stack is pure or not, so he can't reasonably fix it to make it pure.
Well look at the subject: "inherent" vs "practical".
When you're dealing with real code, it might as well be inherent. Your argument boils down to "a sufficiently smart compiler" which will never exist and "programmers going to superhuman lengths" which never happens to try and justify that C is the fastest. You might as well claim that a cyclist could bicycle from Seattle to Portland and average 30 mph, because he can hit 30 in a sprint.
 (3) does require whole program analysis, which is done (as I said) by the
 linker, and is not normally done. Will gcc/ld do it? Nope. (It's not so trivial.)
Both GCC and LLVM can perform link time optimizations.
Inlining across source files?
 (4) just try retrofitting an existing C program to do this. I have. You'll
 give it up soon :-)
Again -- the subject is "inherent code performance advantages".
It's inherent, as C does not support this, everything about C works against you if you try it, and those who try are punished. You can't even use much of the standard library if you try this.
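
For reference, a minimal sketch of what the (char*, len) convention discussed above looks like in C. The type and function names (`str_slice`, `slice_from_cstr`, `slice_sub`, `slice_equals`) are hypothetical and not taken from any post in this thread; the sketch only illustrates why taking a substring needs no malloc/copy/free, and why functions such as strcmp() stop being usable once strings are no longer 0-terminated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical slice type: a pointer plus an explicit length.
 * The bytes it refers to are NOT required to be NUL-terminated. */
typedef struct {
    const char *ptr;
    size_t len;
} str_slice;

/* Wrap a NUL-terminated C string; this is the only strlen() call. */
str_slice slice_from_cstr(const char *s)
{
    str_slice r = { s, strlen(s) };
    return r;
}

/* Take the sub-range [begin, end) without allocating or copying. */
str_slice slice_sub(str_slice s, size_t begin, size_t end)
{
    assert(begin <= end && end <= s.len);
    str_slice r = { s.ptr + begin, end - begin };
    return r;
}

/* strcmp() cannot be used on slices because there is no terminator;
 * the comparison has to go through the explicit lengths instead. */
int slice_equals(str_slice a, str_slice b)
{
    return a.len == b.len && memcmp(a.ptr, b.ptr, a.len) == 0;
}
```

Printing such a slice also needs the length spelled out, for example `printf("%.*s", (int)w.len, w.ptr)`, which gives a feel for how much of the standard library assumes a terminator.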
Dec 09 2013
next sibling parent reply "Araq" <rumpf_a web.de> writes:
On Monday, 9 December 2013 at 19:19:46 UTC, Walter Bright wrote:
 On 12/9/2013 6:24 AM, Araq wrote:
 ("When in doubt, assume it modifies this location.")
And it's usually in doubt, often enough to make that optimization a pipe dream.
I disagree.
 (2) is impractical because there's no way for the programmer 
 to detect if his
 call stack is pure or not, so he can't reasonably fix it to 
 make it pure.
Well look at the subject: "inherent" vs "practical".
When you're dealing with real code, it might as well be inherent.
*shrug* so go ahead and redefine the meaning of words as you please. That language X is faster than C in "practice" because X is much more developer friendly and thus you can tweak your code much easier etc. is an argument of every language out there.
Dec 09 2013
next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Monday, 9 December 2013 at 19:45:34 UTC, Araq wrote:
 That language X is faster than C in "practice" because X is 
 much more developer friendly and thus you can tweak your code 
 much easier etc. is an argument of every language out there.
Not really. It's very hard to beat sensibly written C* in most languages. D is one of the few where you have a chance on anything other than very specific benchmarks. Others include Fortran and C++.

*Note: not the impossible, perfect C that no one can actually write for more than ~1000 lines. Just good, straightforward, normal C code.
Dec 09 2013
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/9/2013 11:45 AM, Araq wrote:
 On Monday, 9 December 2013 at 19:19:46 UTC, Walter Bright wrote:
 On 12/9/2013 6:24 AM, Araq wrote:
 ("When in doubt, assume it modifies this location.")
And it's usually in doubt, often enough to make that optimization a pipe dream.
I disagree.
I'm not without some experience with this - I have written a data flow analysis optimizer (it's in dmd) and I know how they work.
Dec 09 2013
parent reply "Araq" <rumpf_a web.de> writes:
On Monday, 9 December 2013 at 20:38:34 UTC, Walter Bright wrote:
 On 12/9/2013 11:45 AM, Araq wrote:
 On Monday, 9 December 2013 at 19:19:46 UTC, Walter Bright 
 wrote:
 On 12/9/2013 6:24 AM, Araq wrote:
 ("When in doubt, assume it modifies this location.")
And it's usually in doubt, often enough to make that optimization a pipe dream.
I disagree.
I'm not without some experience with this - I have written a data flow analysis optimizer (it's in dmd) and I know how they work.
Well so do I.
Dec 09 2013
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/9/2013 1:51 PM, Araq wrote:
 On Monday, 9 December 2013 at 20:38:34 UTC, Walter Bright wrote:
 On 12/9/2013 11:45 AM, Araq wrote:
 On Monday, 9 December 2013 at 19:19:46 UTC, Walter Bright wrote:
 On 12/9/2013 6:24 AM, Araq wrote:
 ("When in doubt, assume it modifies this location.")
And it's usually in doubt, often enough to make that optimization a pipe dream.
I disagree.
I'm not without some experience with this - I have written a data flow analysis optimizer (it's in dmd) and I know how they work.
Well so do I.
Have you instrumented one and then run it on real programs to see how often certain optimizations can be applied? Because I have, and these sorts of optimization opportunities rarely occur. "It's always something" that prevents it because the worst possible case always is the one that rules. It can be as simple as retrieving a pointer from a data structure. Or one of the callers to a function is getting its pointer from an API that you didn't have source for (like the operating system). Or it came from malloc(). Etc.
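
A small C sketch of the kind of case being argued about, under the assumption that the optimizer sees only this function and not all of its callers. The function names are made up for illustration. Even though `src` is declared `const`, the compiler has to assume the store through `dst` may change `*src`, so the second load cannot be folded into the first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: 'src' is declared const, but in C that only
 * restricts writes made through 'src' itself. The object may still be
 * modified through another pointer such as 'dst'. */
int read_write_read(const int *src, int *dst)
{
    assert(src != NULL && dst != NULL);

    int first = *src;   /* first load of *src */
    *dst = first + 1;   /* store that may or may not hit *src */
    return *src;        /* must be reloaded: dst could alias src */
}

/* Caller 1: distinct objects, so the second load sees the old value. */
int call_distinct(void)
{
    int a = 10, b = 0;
    return read_write_read(&a, &b);   /* a is untouched */
}

/* Caller 2: the same object is passed twice, so the store changes *src. */
int call_aliased(void)
{
    int x = 10;
    return read_write_read(&x, &x);   /* x becomes 11 before the reload */
}
```

Both callers are valid C, which is why a conservative analysis has to keep the reload unless it can prove that no caller anywhere passes overlapping pointers.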
Dec 09 2013
prev sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Monday, 9 December 2013 at 21:51:09 UTC, Araq wrote:
 On Monday, 9 December 2013 at 20:38:34 UTC, Walter Bright wrote:
 On 12/9/2013 11:45 AM, Araq wrote:
 On Monday, 9 December 2013 at 19:19:46 UTC, Walter Bright 
 wrote:
 On 12/9/2013 6:24 AM, Araq wrote:
 ("When in doubt, assume it modifies this location.")
And it's usually in doubt, often enough to make that optimization a pipe dream.
I disagree.
I'm not without some experience with this - I have written a data flow analysis optimizer (it's in dmd) and I know how they work.
Well so do I.
My experience also matches Walter's.
Dec 09 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 09/12/13 20:45, Araq wrote:
 That language X is faster than C in "practice" because X is much more developer
 friendly and thus you can tweak your code much easier etc. is an argument of
 every language out there.
Yes, but most languages (certainly most "friendly" ones) do not allow you to drill down into the details in the way that D does. The claim that's made for most languages in my experience is that, sure, the language is ultimately slower, but the developer time saved is worth more than the performance hit, and if you ever _really_ need to gain that extra performance, you can always drop down into C. Of course, that assumption doesn't hold for some domains (e.g. intensive scientific simulation, games...) which is why C/C++ still has a significant presence there. By contrast with D you get all that friendliness of refactoring and redesigning and extra time to experiment with alternatives, but in a language which is speed-wise on a par with C anyway; and if its higher-level constructs cause any problems, you can drill down to micro-management of your program _while still writing in D_. That's why D is very useful for heavy-duty scientific simulation and why unlike most other friendly languages it's a genuine contender for games programming.
Dec 09 2013
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Dec 10, 2013 at 08:28:14AM +0100, Joseph Rushton Wakeling wrote:
 On 09/12/13 20:45, Araq wrote:
That language X is faster than C in "practice" because X is much more
developer friendly and thus you can tweak your code much easier etc.
is an argument of every language out there.
Yes, but most languages (certainly most "friendly" ones) do not allow you to drill down into the details in the way that D does. The claim that's made for most languages in my experience is that, sure, the language is ultimately slower, but the developer time saved is worth more than the performance hit, and if you ever _really_ need to gain that extra performance, you can always drop down into C. Of course, that assumption doesn't hold for some domains (e.g. intensive scientific simulation, games...) which is why C/C++ still has a significant presence there. By contrast with D you get all that friendliness of refactoring and redesigning and extra time to experiment with alternatives, but in a language which is speed-wise on a par with C anyway; and if its higher-level constructs cause any problems, you can drill down to micro-management of your program _while still writing in D_. That's why D is very useful for heavy-duty scientific simulation and why unlike most other friendly languages it's a genuine contender for games programming.
Recently I rewrote one of my personal pet projects in D. It turned out a lot better than the original C version -- D's high-level features made it easier to implement complex functionality with relatively simple (and readable!) code. Then I came to a CPU-intensive bit, and initially things didn't look very good: a particular medium-sized problem that I was using as a real-life test case ran in seconds in the C version, but took a very long time with the D version (about 40 *minutes*, if I remember).

I was a bit disappointed, but remembered that the D version was still not yet refined. It turned out that I had overlooked a simple but very significant optimization present in the C version that hadn't been implemented in the D version yet. (Basically, it's a brute-force combinatorial algorithm, so the complexity is exponential, and every little problem reduction can make a big difference. In this case, it was a matter of recognizing the equivalence of certain problem combinations that allowed the reduction of the search space by a factor of about n factorial.)

In the original C code, it took quite a while to implement this optimization because ... well, in C, you had to spell out every last thing, otherwise it just won't work. In D, I kicked a crude version of it out in under a day. Result? The D version now runs faster than the C version -- perhaps up to an order of magnitude. I was suitably impressed.

It's not all roses and flowers yet, of course -- the D version has some memory usage issues that I need to work on, but the fact that an almost optimal-performing version of the code could be done so smoothly in such a short time speaks good things about D. Plus, D's ease of expression made it so easy to implement new algorithms without incurring runtime costs -- I easily implemented an A* search algorithm instead of the original plain BFS, and now, for certain problems, the D version flat out beats the C version in terms of performance.
Could I have implemented an A* search in the C version? Sure, after weeks or months of careful reworking of the C code to make sure I didn't break anything or introduce any segfaults. Arguably, the result would perform better than D. But the fact of the matter is, with D, I achieved extremely good performance for just a few days' work. And I think that's the crux of the issue here: sure in C you can code in a way that will beat every other language out there. But it requires a lot of careful effort to get you there (not to mention dealing with all the associated pitfalls). D may not get you all the way to absolute every-last-drop-from-the-CPU performance, but it gets you 90% of the way there, with far, far less effort. Most of the time, this is a far better ROI than in C. T -- Three out of two people have difficulties with fractions. -- Dirk Eddelbuettel
Dec 10 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/10/2013 12:21 AM, H. S. Teoh wrote:
 Result? The D version now runs faster than the C version -- perhaps up
 to an order of magnitude.
This case history would make a great blog post.
Dec 10 2013
parent "Atila Neves" <atila.neves gmail.com> writes:
On Tuesday, 10 December 2013 at 08:28:12 UTC, Walter Bright wrote:
 On 12/10/2013 12:21 AM, H. S. Teoh wrote:
 Result? The D version now runs faster than the C version -- 
 perhaps up
 to an order of magnitude.
This case history would make a great blog post.
+1. My opinion _might_ be just a tad biased given that I just wrote one myself. :)
Dec 10 2013
prev sibling parent reply Marco Leise <Marco.Leise gmx.de> writes:
Am Tue, 10 Dec 2013 00:21:16 -0800
schrieb "H. S. Teoh" <hsteoh quickfur.ath.cx>:

 D may not get you all the way to absolute every-last-drop-from-the-CPU
 performance
I firmly disagree. -- Marco
Dec 10 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/10/2013 1:01 PM, Marco Leise wrote:
 Am Tue, 10 Dec 2013 00:21:16 -0800
 schrieb "H. S. Teoh" <hsteoh quickfur.ath.cx>:

 D may not get you all the way to absolute every-last-drop-from-the-CPU
 performance
I firmly disagree.
Especially since if you write C code in D, you will get exactly C results.
Dec 10 2013
parent reply "Dicebot" <public dicebot.lv> writes:
On Tuesday, 10 December 2013 at 20:36:22 UTC, Walter Bright wrote:
 On 12/10/2013 1:01 PM, Marco Leise wrote:
 Am Tue, 10 Dec 2013 00:21:16 -0800
 schrieb "H. S. Teoh" <hsteoh quickfur.ath.cx>:

 D may not get you all the way to absolute 
 every-last-drop-from-the-CPU
 performance
I firmly disagree.
Especially since if you write C code in D, you will get exactly C results.
I think it is better to rephrase it as "writing C code in D is possible but even less convenient than in C".
Dec 10 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/10/2013 12:39 PM, Dicebot wrote:
 I think it is better to rephrase it as "writing C code in D is possible but
 even less convenient than in C".
Why would it be less convenient? At the least, it'll compile a lot faster!
Dec 10 2013
next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tuesday, 10 December 2013 at 21:05:53 UTC, Walter Bright wrote:
 At the least, it'll compile a lot faster!
Small C programs compile a *lot* faster than small D programs that use Phobos.

import std.stdio; == add half a second to your compile time.

$ time dmd hellod.d
user 0m0.649s
sys 0m0.102s

$ time gcc helloc.c
user 0m0.095s
sys 0m0.039s

yikes, even doing printf in D is slow nowadays

$ time dmd hellod.d
user 0m0.212s
sys 0m0.058s

Larger D programs do better, of course, at least if you compile all the files at once (and don't use so much CTFE that it starts thrashing the swap file).
Dec 10 2013
next sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
Am 10.12.2013 22:16, schrieb Adam D. Ruppe:
 On Tuesday, 10 December 2013 at 21:05:53 UTC, Walter Bright wrote:
 At the least, it'll compile a lot faster!
Small C programs compile a *lot* faster than small D programs that use Phobos.

import std.stdio; == add half a second to your compile time.

$ time dmd hellod.d
user 0m0.649s
sys 0m0.102s

$ time gcc helloc.c
user 0m0.095s
sys 0m0.039s

yikes, even doing printf in D is slow nowadays

$ time dmd hellod.d
user 0m0.212s
sys 0m0.058s

Larger D programs do better, of course, at least if you compile all the files at once (and don't use so much CTFE that it starts thrashing the swap file).
Those are implementation issues, right? -- Paulo
Dec 10 2013
parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tuesday, 10 December 2013 at 21:28:41 UTC, Paulo Pinto wrote:
 Those are implementation issues, right?
Yeah, a lot of the blame can be placed on the intertwined Phobos modules. D without Phobos is a lot faster to compile (and produces significantly smaller exes).
Dec 10 2013
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Dec 10, 2013 at 10:52:10PM +0100, Adam D. Ruppe wrote:
 On Tuesday, 10 December 2013 at 21:28:41 UTC, Paulo Pinto wrote:
Those are implementation issues, right?
Yeah, a lot of the blame can be placed on the intertwined phobos modules. D without phobos is a lot faster to compile (and produces significantly smaller exes).
That's why I proposed reducing Phobos interdependencies. Didn't sound like it caught on, though. :-( T -- This sentence is false.
Dec 10 2013
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Dec 10, 2013 at 10:16:25PM +0100, Adam D. Ruppe wrote:
 On Tuesday, 10 December 2013 at 21:05:53 UTC, Walter Bright wrote:
At the least, it'll compile a lot faster!
Small C programs compile a *lot* faster than small D programs that use Phobos. import std.stdio; == add half a second to your compile time.
That's not too bad. One of my template-heavy projects takes almost 10 seconds to compile just a single source file. And a single string import can add up to 3-4 seconds (well, probably due to CTFE since I call split() on the string). T -- Life is unfair. Ask too much from it, and it may decide you don't deserve what you have now either.
Dec 10 2013
next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tuesday, 10 December 2013 at 21:39:12 UTC, H. S. Teoh wrote:
 One of my template-heavy projects take almost 10 seconds to 
 compile just a single source file.
Always compile all your D files at once, that's my tip, otherwise you'll pay that cost every time the file is imported! But yeah, my work app can take up to a full minute to compile, lots of CTFE there. D can be super fast, but can also be slow with the current implementation.
Dec 10 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/10/2013 1:37 PM, H. S. Teoh wrote:
 That's not too bad. One of my template-heavy projects take almost 10
 seconds to compile just a single source file. And a single string import
 can add up to 3-4 seconds (well, probably due to CTFE since I call
 split() on the string).
That isn't C-style code. The issue is convenience of writing C code in D vs C.
Dec 10 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Dec 10, 2013 at 03:48:51PM -0800, Walter Bright wrote:
 On 12/10/2013 1:37 PM, H. S. Teoh wrote:
That's not too bad. One of my template-heavy projects take almost 10
seconds to compile just a single source file. And a single string
import can add up to 3-4 seconds (well, probably due to CTFE since I
call split() on the string).
That isn't C-style code. The issue is convenience of writing C code in D vs C.
So you're trying to say that it's easier to write C code in D, rather than in C? I thought this thread was about the inherent advantages of D over C. And I agree that D has a lot of advantages, but it would be a lie to say that there are no disadvantages. T -- All problems are easy in retrospect.
Dec 10 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/10/2013 4:10 PM, H. S. Teoh wrote:
 On Tue, Dec 10, 2013 at 03:48:51PM -0800, Walter Bright wrote:
 On 12/10/2013 1:37 PM, H. S. Teoh wrote:
 That's not too bad. One of my template-heavy projects take almost 10
 seconds to compile just a single source file. And a single string
 import can add up to 3-4 seconds (well, probably due to CTFE since I
 call split() on the string).
That isn't C-style code. The issue is convenience of writing C code in D vs C.
So you're trying to say that it's easier to write C code in D, rather than in C? I thought this thread was about the inherent advantages of D over C.
I was referring specifically to Dicebot's post as ancestor: -- I think it is better to rephrase it as "writing C code in D is possible but even less convenient than in C". --
 And I agree that D has a lot of advantages, but it would be a lie to say
 that there are no disadvantages.
I don't believe anyone made that claim.
Dec 10 2013
parent reply "ed" <sillymongrel gmail.com> writes:
On Wednesday, 11 December 2013 at 03:33:47 UTC, Walter Bright 
wrote:
[snip]

 The issue is convenience of writing C code in D vs C.
So you're trying to say that it's easier to write C code in D, rather than in C? I thought this thread was about the inherent advantages of D over C.
I was referring specifically to Dicebot's post as ancestor:
[snip]

I am finding C is much easier and more pleasant to write with DMD. At work we're forced, under duress, to write C. I just got a new project with a loose deadline so I thought I'd do a crazy experiment to make it interesting...

(NOTE: I say "under duress" but I secretly like C/C++, especially C++11/14.)

I'm writing my C code with DMD. When tested and tweaked I do a final compile with a C compiler (test once more) then commit for our QA to pick up. Occasionally I'll compile with the C compiler to ensure I haven't leaked any D into the code and to minimise the #include fixups at the end.

Currently this is about 20 C-(D) files with approx. 12,000-15,000 LOC. I doubt this workflow would scale much further, although it doesn't look like becoming an issue yet.

My experiment is a success IMO. My C code is much cleaner, safer and more maintainable because of it. Yes, I know I could write C like this without DMD ... but I'm lazy and fall back into bad C habits :-)

I now advocate that students should be taught C programming with the DMD compiler :D

Cheers, Ed
Dec 11 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/11/2013 6:11 PM, ed wrote:
 I am finding C is much easier and more pleasant to write with DMD.
I find the same thing!
 At work we're forced, under duress, to write C.
My condolences!
 I'm writing my C code with DMD. When tested and tweaked I do a final compile
 with C compiler (test once more) then commit for our QA to pick up.
 Occasionally I'll compile with the C compiler to ensure I haven't leaked any D
 into the code and to minimise the #include fixups at the end.
Wow. This is a pretty interesting use case.
 Currently this is about 20 C-(D) files with approx. 12,000-15,000 LOC. I doubt
 this workflow would scale much further, although it doesn't look like becoming
 an issue yet.

 My experiment is a success IMO. My C code is much cleaner, safer and more
 maintainable because of it. Yes, I know I could write C like this without DMD
 ... but I'm lazy and fall back into bad C habits :-)

 I now advocate that students should be taught C programming with the DMD
 compiler :D
This really is cool. BTW, this sounds a lot like when I used to develop real mode MSDOS programs. An errant pointer in MSDOS would frequently crash the system and even scramble the hard disk. It was pretty bad. Therefore, I'd do all my development on a protected mode operating system (Windows NT or OS/2 16 bit), and only when it was bug free would I even attempt to bring it up under MSDOS. This approach saved me endless hours of misery.
Dec 11 2013
parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 12 December 2013 at 03:34:16 UTC, Walter Bright
wrote:
 On 12/11/2013 6:11 PM, ed wrote:
 ....
BTW, this sounds a lot like when I used to develop real mode MSDOS programs. An errant pointer in MSDOS would frequently crash the system and even scramble the hard disk. It was pretty bad. Therefore, I'd do all my development on a protected mode operating system (Windows NT or OS/2 16 bit), and only when it was bug free would I even attempt to bring it up under MSDOS. This approach saved me endless hours of misery.
I used Turbo Pascal instead. :)
Dec 12 2013
prev sibling next sibling parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 12 December 2013 at 02:12:00 UTC, ed wrote:
 On Wednesday, 11 December 2013 at 03:33:47 UTC, Walter Bright 
 wrote:
 [snip]

 The issue is convenience of writing C code in D vs C.
So you're trying to say that it's easier to write C code in D, rather than in C? I thought this thread was about the inherent advantages of D over C.
I was referring specifically to Dicebot's post as ancestor:
[snip] I am finding C is much easier and more pleasant to write with DMD. At work we're forced, under duress, to write C. I just got a new project with a loose deadline so I thought I'd do a crazy experiment to make it interesting... (NOTE: I say "under duress" but I secretly like C/C++, especially C++11/14.) I'm writing my C code with DMD. When tested and tweaked I do a final compile with C compiler (test once more) then commit for our QA to pick up. Occasionally I'll compile with the C compiler to ensure I haven't leaked any D into the code and to minimise the #include fixups at the end. Currently this is about 20 C-(D) files with approx. 12,000-15,000 LOC. I doubt this workflow would scale much further, although it doesn't look like becoming an issue yet. My experiment is a success IMO. My C code is much cleaner, safer and more maintainable because of it. Yes, I know I could write C like this without DMD ... but I'm lazy and fall back into bad C habits :-) I now advocate that students should be taught C programming with the DMD compiler :D Cheers, Ed
Currently I advocate that C and C++ development should always be done with warnings as errors enabled, coupled with static analyzers at the very least during CI builds, breaking them if anything is found. Nice story though, thanks for sharing. -- Paulo
Dec 12 2013
next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 12/12/13 10:01, Paulo Pinto wrote:
 Currently I always advocate that C and C++ development should
 always be done with warnings as errors enabled, coupled with
 static analyzers at very least during CI builds, breaking them if
 anything is found.
I do think I owe quite a bit to the university professor who took the scientific programming course, for instructing us to compile with gcc -ansi -pedantic -Wall Not as comprehensive as -Werror etc., but still a good way to be started off in programming life. It's meant that I've always subsequently appreciated the existence and value of language standards. And it's generated much amusement among academic colleagues: "Finally, there is such a virtuous saint among us!" or words to that effect :-P
Dec 12 2013
prev sibling next sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Thursday, 12 December 2013 at 09:01:17 UTC, Paulo Pinto wrote:
 Currently I always advocate that C and C++ development should
 always be done with warnings as errors enabled, coupled with
 static analyzers at very least during CI builds, breaking them 
 if
 anything is found.
I literally can't imagine any large C project surviving for long without mandatorily doing everything listed. It gets to a state of unmaintainable insanity so fast. That said, there are very different C projects. When I say that coding C in D is less convenient than in C, I don't mean some "normal" but performance-intensive application. I can't imagine anyone picking C motivated only by performance - it is more about being very controllable. Of course with modern optimizers C can no longer be called "macro assembler", but it is still much, much closer to that than D. To remove all "smart" side-effects in D you need to get rid of all of druntime, avoid using some language features and resort to inline assembly relatively often. It is definitely possible, and Adam has done a very nice job proving it. But it leaves you with a very crippled language that does not even help you in sticking with that crippled subset. At this point you really start asking yourself - what does this give me over raw C to motivate the transition? So far I don't see anything convincing.
Dec 12 2013
next sibling parent "Tobias Pankrath" <tobias pankrath.net> writes:
On Thursday, 12 December 2013 at 11:16:07 UTC, Dicebot wrote:
 what does this give me over raw C to motivate the
 transition? So far I don't see anything convincing.
Every time I write #define in one of my 8bit μC pet projects, I know a reason.
Dec 12 2013
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
On 12 December 2013 21:16, Dicebot <public dicebot.lv> wrote:

 On Thursday, 12 December 2013 at 09:01:17 UTC, Paulo Pinto wrote:

 Currently I always advocate that C and C++ development should
 always be done with warnings as errors enabled, coupled with
 static analyzers at very least during CI builds, breaking them if
 anything is found.
I literally can't imagine any large C project surviving any long without mandatory doing all listed stuff. It gets to state of unmaintainable insanity so fast.
I feel quite the opposite, I would say that about C++ personally. I've built a C codebase from the ground up over the course of a decade with ~25 programmers. It takes discipline, and a certain sense of simplicity in your solutions. I personally advocate C over C++ for this very reason, it emphasises simplicity in your solutions. It's impossible to get carried away and create the sort of unmaintainable bullshit that C++ leads to. I like C, I just find it verbose, and prone to boilerplate, which has a tendency to waste programmers' time... and what is more valuable than a programmer's time?
 That said, there are very different C projects. When I am
 speaking that coding C in D is less convenient than in C I don't
 mean some "normal" but performance-intensive application. I can't
 imagine anyone picking C motivated only by performance - it is
 more about being very controllable. Of course with modern
 optimizers C can no more be called "macro assembler" but it is
 still much much closer to that than D. To remove all "smart"
 side-effects in D you need to get rid of all druntime, avoid
 using some language features and resort to inline assembly
 relatively often. It is definitely possible and Adam has done
 some very nice job to prove it. But it leaves you with a very
 crippled language that does not even help you in sticking with
 that crippled subset. At this point you really start asking
 yourself - what does this give me over raw C to motivate the
 transition? So far I don't see anything convincing.
I still consider C a macro assembler... I can easily (and usually do) visualise the asm output I expect the compiler to produce while I'm coding. If I'm writing performance intensive code, I am constantly disassembling and checking that the compiler is producing the code I am expecting. This feels normal to me. What would you want inline assembly for in D? Inline assembly is almost always a mistake, unless you're writing a driver. You can't possibly schedule code better than the compiler. And in my experience, without breaking the ABI, I don't know any constructs I could produce manually in assembly that I can't easily coerce the compiler to generate for me (with better scheduling). Perhaps prefetching or branch prediction hinting, which the compiler would typically require running a profile guided optimisation pass to generate, but there are intrinsics to insert those manually which don't interrupt the compiler's ability to reschedule the function.
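For readers who haven't seen such intrinsics, here is a rough sketch of prefetch and branch hints in C. It assumes GCC or Clang (__builtin_expect and __builtin_prefetch are their builtins; other compilers spell these differently). The function name, the UNLIKELY macro and the PREFETCH_AHEAD distance are made up for illustration, not taken from any codebase mentioned in this thread.

```c
#include <stddef.h>

/* Assumption: GCC or Clang. UNLIKELY is a hypothetical helper macro. */
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

/* Hypothetical tuning value; the right distance depends on the target. */
enum { PREFETCH_AHEAD = 16 };

/* Sums the non-negative entries of data[0..n). Negative entries are hinted
   as the rare case, and elements a few iterations ahead are prefetched. */
long sum_non_negative(const int *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&data[i + PREFETCH_AHEAD], 0, 1); /* 0 = read, 1 = low temporal locality */
        if (UNLIKELY(data[i] < 0))
            continue; /* hinted as the cold path */
        total += data[i];
    }
    return total;
}
```

Because these are hints rather than inline asm, the compiler remains free to schedule the surrounding loop as it sees fit. Whether they help at all has to be measured on the actual target.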
Dec 12 2013
next sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 12 December 2013 at 11:42:12 UTC, Manu wrote:
 On 12 December 2013 21:16, Dicebot <public dicebot.lv> wrote:

 On Thursday, 12 December 2013 at 09:01:17 UTC, Paulo Pinto 
 wrote:

 Currently I always advocate that C and C++ development should
 always be done with warnings as errors enabled, coupled with
 static analyzers at very least during CI builds, breaking 
 them if
 anything is found.
I literally can't imagine any large C project surviving any long without mandatory doing all listed stuff. It gets to state of unmaintainable insanity so fast.
I feel quite the opposite, I would say that about C++ personally. I've built a C codebase from the ground over the course of a decade with ~25 programmers. It takes discipline, and a certainly sense of simplicity in your solutions. I personally advocate C over C++ for this very reason, it emphasises simplicity in your solutions. It's impossible to get carried away and create the sort of unmaintainable bullshit that C++ leads to. I like C, I just find it verbose, and prone to boiler plate, which has a tendency to waste programmers time... and what is more valuable than a programmers time?
I favor C++ over C, thanks to the safer constructs it offers me, with a type safety closer to the Pascal family of languages that C will never be able to offer. However I tend to code very seldom in C or C++ nowadays, besides hobby projects, as the enterprise world is all about GC enabled languages, with a little C++ for performance hotspots. In any case, given my enterprise experience with subcontractors, I think it is very hard to find good developers who are able to write error free C or C++ code without lots of enforced guidelines to guide them screaming along the way.
-- Paulo
Dec 12 2013
prev sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Thursday, 12 December 2013 at 11:42:12 UTC, Manu wrote:
 I've built a C codebase from the ground over the course of a 
 decade with
 ~25 programmers.
 It takes discipline, and a certainly sense of simplicity in 
 your solutions.
It may work if you can afford to guarantee a certain level of competence for the majority of programmers in the team, but I think that is the exception in practice, not the rule. Also I had somewhat larger teams in mind, as tends to happen with enterprise C :)
 and what is more valuable than a
 programmers time?
At some point new servers + server maintenance become more expensive than programmers' time. Much more expensive.
 I still consider C a macro assembler... I can easily (and 
 usually do)
 visualise the asm output I expect the compiler to produce while 
 I'm coding.
 If I'm writing performance intensive code, I am constantly 
 disassembling
 and checking that the compiler is producing the code I am 
 expecting. This
 feels normal to me.
Did you use many different compilers? I am afraid that doing that on a regular basis is a feat of strength beyond my imagination :)
 What would you want inline assembly for in D? Inline assembly 
 is almost
 always a mistake, unless you're writing a driver.
I can't find the code Adam used to provide minimal D runtime stubs for compiling C-like programs, but he was forced to use inline assembly there in a few cases. Can't remember the details, sorry. And of course I am speaking about drivers / kernels / bare metal. I can't imagine any other domain where using C is still absolutely necessary for practical reasons.
 You can't possibly
 schedule code better than the compiler.
 ...
I am not implying that one should do anything by hand because the compiler is bad at it. I have not actually used inline assembly with C even a single time in my life. That wasn't my point.
Dec 12 2013
next sibling parent reply Manu <turkeyman gmail.com> writes:
On 12 December 2013 22:21, Dicebot <public dicebot.lv> wrote:

 On Thursday, 12 December 2013 at 11:42:12 UTC, Manu wrote:

 I've built a C codebase from the ground over the course of a decade with
 ~25 programmers.
 It takes discipline, and a certainly sense of simplicity in your
 solutions.
It may work if you can afford to guarantee certain level of competence of majority of programmers in the team but I think is exception in practice, not rule. Also I had a bit larger teams in mind as it tends to happen with enterprise C :)
Completely true. Fortunately I've always worked on tech/engine teams, which are mostly populated with seniors, or competent up-and-comers.
 and what is more valuable than a
 programmers time?
At some point new servers + server maintenance becomes more expensive than programmers time. Much more expensive.
But that's not a concern for typical programmers. That's the responsibility of sysadmins. What I meant was, 'what's more valuable [to a programmer]...'
 I still consider C a macro assembler... I can easily (and usually do)
 visualise the asm output I expect the compiler to produce while I'm
 coding.
 If I'm writing performance intensive code, I am constantly disassembling
 and checking that the compiler is producing the code I am expecting. This
 feels normal to me.
Did you use many different compilers? I am afraid that doing that on a common basis is feat of strength beyond my imagination :)
Yup. Over the past 10 years, my day job involved: VisualC (it's changed a LOT over the years), GCC (for many architectures), CodeWarrior (for many architectures), SNC (for many architectures), Clang, and some other proprietary compilers. You learn each of their quirks with time, and also how to reconcile the differences between them. You also learn every preprocessor trick imaginable... Worse than the compilers are the standard libraries, which are anything but standard. In the end the ONLY function from the CRT that we called was sprintf(). We had our own implementations of everything else we used.
I'm absolutely conscious of these sorts of issues when I consider my approach to D. Many of my vocal opinions stem from a desire to mitigate these sorts of problems in the future, and make sure it is possible to directly express codegen concepts that I've previously only been able to indirectly express in C compilers, which often requires some coercion for different compilers, and invariably leads to #ifdef. It's important to be able to explicitly express low-level codegen concepts; even if these are rarely used features, it means it's possible to write code that is reliably portable. Sadly, most people really don't care too much about portability.
 What would you want inline assembly for in D? Inline assembly is almost
 always a mistake, unless you're writing a driver.
I can't find code Adam used to provide minimal d runtime stubs to compile C-like programs but he was forced to use in-line assembly there in few cases. Can't remember details, sorry.
Right. It's usually necessary for hacks that have to interact with, or bend/subvert the ABI. But that's a pretty rare necessity, not a requirement in day-to-day code.
 And of course I am speaking about drivers / kernels / barebone. I can't
 imagine any other domain where using C is still absolutely necessary for
 practical reasons.
You mean C-like-native-languages? There's not really anything C offers that C++/D doesn't also offer at the lowest level. Our choice to use C rather than C++ was, in a sense, a funny way to enforce a coding standard. Like I say, it forces simplicity, and a consistent approach to problems.
 You can't possibly
 schedule code better than the compiler.
 ...
I am not implying that one should do anything by hand because compiler is bad at it. I have not actually used inline assembly with C even a single time in my life. That wasn't about it.
The only thing I've ever had to use it for in recent years is manually fiddling with flags registers, or interacting with hardware-specific concepts that C/C++ doesn't have ways to express (interrupt levels, privilege control, MMU control, context switching). Also, SIMD. C++ compilers traditionally didn't have any vector support, so before intrinsics were common, you had to do all SIMD code in asm >_< .. Fortunately, that's a thing of the past.
Dec 12 2013
parent reply "Dicebot" <public dicebot.lv> writes:
On Thursday, 12 December 2013 at 13:17:14 UTC, Manu wrote:
 But that's not a concern for typical programmers. That the 
 responsibility
 of sysadmins.
 What I meant was, 'what's more valuable [to a programmer]...'
Leaning dangerously close to philosophy here :)
 Did you use many different compilers? I am afraid that doing 
 that on a
 common basis is feat of strength beyond my imagination :)
Yup. Over the past 10 years, my day job involved: ...
Well, you are a much more proficient and experienced C programmer than me :P (And than I will ever be, considering I have no desire to return to that world.) It is clearly beyond my imagination. I have only been investigating disassembly to verify which stuff actually gets there and which doesn't, and for debugging of course.
 And of course I am speaking about drivers / kernels / 
 barebone. I can't
 imagine any other domain where using C is still absolutely 
 necessary for
 practical reasons.
You mean C-like-native-languages? There's not really anything C offers that C++/D doesn't also offer at the lowest level.
Right now the problem is not with stuff C offers and D doesn't. It is the stuff that D offers on top, which you are forced to fight to get back to C's level of simplicity.
 Our choice to use C rather than C++ was in a sense, a funny way 
 to enforce
 a coding standard. Like I say, it forces simplicity, and a 
 consistent
 approach to problems.
When I was speaking about "domains where C is still necessary" I was not opposing it to C++ but to other modern languages in general. Even real-time service development is quite possible using stuff like Erlang these days. Okay, there is also gamedev, which I won't dare to speak about :) But in general there is not much sense in talking about replacing C while imagining a "normal" userspace application, even a performance-critical one.
Some time ago I was part of a project that completely changed my image of what the C domain can be. It remains the most mind-blowing experience in my programming history (which is damn short of course, but I can't resist a dramatic intro). The project was in the mobile networking / LTE domain, with a huge number of people involved all over the world. Specifically, our team in Latvia was responsible for node cluster software which acted essentially as a giant router + firewall. The actual nodes were custom multi-core MIPS machines with a hardware implementation of the event loop and part of the IP stack handling. All actual packet processing code ran as bare-bones executables - no OS, no virtual memory or any fancy stuff, just a platform SDK you use to build a single processing binary, which is then loaded at a hard-coded memory address (all remaining memory is pre-allocated for packet handling purposes).
There were no really complicated processing algorithms or anything like that, but everyone was extremely picky about the tiny details of how this binary actually ran - making sure no instruction cache misses happen under the normal workflow for each core, that the connection context struct fits into a single cache line, and stuff like that. One of my responsibilities was the actual performance testing of that system, so I could observe how much of an impact those seemingly small tweaks made.
And now every time I hear that "language X can do all the stuff C can" I imagine myself trying to sell this to the other guys working on that project, and have an extremely hard time doing it even in my imagination.
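To make the cache-line point concrete, here is a minimal C11 sketch of how such a constraint can be checked at compile time. Everything in it is assumed for illustration: the 64-byte line size, the struct name conn_ctx and its fields are hypothetical and are not the layout used in the project described above.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdalign.h> /* alignas, alignof (C11) */
#include <stdint.h>

/* Assumption: 64-byte cache lines; check the target's documentation. */
#define CACHE_LINE_SIZE 64

/* Hypothetical per-connection state. Aligning the first member to the
   cache line size aligns the whole struct and pads it to a multiple of it. */
struct conn_ctx {
    alignas(CACHE_LINE_SIZE) uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint32_t state;
    uint64_t bytes_seen;
    uint64_t last_seen_ticks;
};

/* The build breaks as soon as someone grows the struct past one line. */
static_assert(sizeof(struct conn_ctx) == CACHE_LINE_SIZE,
              "conn_ctx must occupy exactly one cache line");
static_assert(alignof(struct conn_ctx) == CACHE_LINE_SIZE,
              "conn_ctx must start on a cache line boundary");
```

The checks cost nothing at run time, so a change to the struct cannot silently undo this kind of tuning.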
Dec 13 2013
parent reply Manu <turkeyman gmail.com> writes:
On 14 December 2013 01:42, Dicebot <public dicebot.lv> wrote:

 On Thursday, 12 December 2013 at 13:17:14 UTC, Manu wrote:

 But that's not a concern for typical programmers. That the responsibility
 of sysadmins.
 What I meant was, 'what's more valuable [to a programmer]...'
Leaning dangerously close to philosophy here :)
Can you offer an alternative? :)
 Our choice to use C rather than C++ was in a sense, a funny way to enforce
 a coding standard. Like I say, it forces simplicity, and a consistent
 approach to problems.
When I was speaking about "domains where C is still necessary" I was not opposing it to C++ but other modern languages in general. Even real-time service development is quite possible using stuff like Erlang these days. Okay, there is also gamedev which I won't dare to speak about :) But in general there is no much sense to speak about replacing C and imagining any "normal" userspace application, even performance-critical. Some time ago I have been part of project that completely changed my image of what C domain can be. It remains most mind-blowing experience in my programming history (which is damn short of course but can't resist dramatical intro). Project was in mobile networking / LTE domain with huge amount of people involved all over the world. Specifically our team in Latvia was responsible for node cluster software which acted essentially as a giant router + firewall. Actual nodes were custom multi-core MIPS machines with h/w implementation of event loop and part of IP stack handling. All actual packet processing code ran as barebone executables - no OS, not virtual memory or any fancy stuff, just platform SDK you take to build single processing binary which is than loaded at hard-coded memory address (all remaining memory is pre-allocated for packet handling purposes). No really complicated processing algorithms or anything like that but everyone was extremely picky about tiny details how this binary was actually running - making sure no instruction cache misses happen under normal workflow for each core, connection context struct fits into single cache line and stuff like that. And one of my responsibilities was actual performance testing of that system so I could have observed how much of an impact those seemingly small tweaks have made. And now every time I hear that "language X can do all stuff C can" I imagine myself trying to sell this to other guys working on that project and have an extremely hard time doing this even in my imagination.
Heh, this sounds pretty much like my life. The same discipline applies to any realtime embedded software (read: video game console). I'm brutally conscious of all the details you mention. I don't think D is incompatible with this at all though. It was early last year... until I convinced Walter to implement align() properly. Now we're good! :) I could still REALLY do with __forceinline though. D doesn't have an effective macro. Obviously, if by 'language X' you mean 'any non-compiled language with pointers', then I totally agree! People who make claims like you say, don't generally know what they're talking about, or what C is actually used for.
Dec 13 2013
parent reply "Dicebot" <public dicebot.lv> writes:
On Friday, 13 December 2013 at 16:28:33 UTC, Manu wrote:
 I could still REALLY do with __forceinline though. D doesn't 
 have an
 effective macro.
 Obviously, if by 'language X' you mean 'any non-compiled 
 language with
 pointers', then I totally agree! People who make claims like 
 you say, don't
 generally know what they're talking about, or what C is 
 actually used for.
I believe (and have posted it over 100 times in the NG already :P) D absolutely needs either a way to force internal linkage or good LTO symbol elimination. Without a preprocessor there is not much you can do to eliminate code duplication other than templates / CTFE - and that bloats the resulting executables a damn lot, something that was very controllable in C.
Dec 13 2013
parent reply Manu <turkeyman gmail.com> writes:
On 14 December 2013 02:50, Dicebot <public dicebot.lv> wrote:

 On Friday, 13 December 2013 at 16:28:33 UTC, Manu wrote:

 I could still REALLY do with __forceinline though. D doesn't have an
 effective macro.
 Obviously, if by 'language X' you mean 'any non-compiled language with
 pointers', then I totally agree! People who make claims like you say,
 don't
 generally know what they're talking about, or what C is actually used for.
I believe (and have posted it over 100 times in NG already :P) D absolutely needs either way to force internal linkage or good LTO symbol elimination. Without preprocessor there is not much you can do to eliminate code duplication other than templates / CTFE - and it bloats resulting executables damn lot, something that was very controllable in C.
Templates aren't guaranteed to inline, and they produce horrible symbol bloat. CTFE isn't inlining, it's pre-computation/runtime elimination. mixin is the closest, but it can't be used effectively in expressions, is horribly dangerous and generally horrible, and requires much keyword pollution. We really do need __forceinline. Walter did agree on one occasion. He said something like "I've been thinking on it, and I think you might be right", which is almost a mental commitment... so there's hope! :P Sadly it was in a hotel parking lot, and not committed to the eternal historic record (ie, the forum). The alt-compilers have an attribute... if only we could alias attributes (or groups of attributes). Another thing we need... :/
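For comparison, this is roughly how forced inlining is usually spelled on the C side. It is a sketch that assumes GCC/Clang (the always_inline attribute) or MSVC (__forceinline). The FORCE_INLINE macro name and the rotl32 helper are made up for the example.

```c
#include <stdint.h>

/* Hypothetical wrapper macro; this is exactly the kind of per-compiler
   #if that a language-level feature would make unnecessary. */
#if defined(_MSC_VER)
#  define FORCE_INLINE static __forceinline
#else
#  define FORCE_INLINE static inline __attribute__((always_inline))
#endif

/* Rotate left by r bits; small enough that a call would cost more than
   the work itself. The masks keep both shifts in range when r is 0. */
FORCE_INLINE uint32_t rotl32(uint32_t x, uint32_t r)
{
    r &= 31u;
    return (x << r) | (x >> ((32u - r) & 31u));
}

/* A caller; rotl32 is expanded in place here rather than being called. */
uint32_t mix(uint32_t h, uint32_t k)
{
    return rotl32(h ^ k, 13u) * 5u + 0xe6546b64u;
}
```

Because the helper is also static, it has internal linkage and need not leave a symbol in the object file.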
Dec 13 2013
next sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Friday, 13 December 2013 at 17:01:21 UTC, Manu wrote:
 On 14 December 2013 02:50, Dicebot <public dicebot.lv> wrote:

 On Friday, 13 December 2013 at 16:28:33 UTC, Manu wrote:

 I could still REALLY do with __forceinline though. D doesn't 
 have an
 effective macro.
 Obviously, if by 'language X' you mean 'any non-compiled 
 language with
 pointers', then I totally agree! People who make claims like 
 you say,
 don't
 generally know what they're talking about, or what C is 
 actually used for.
I believe (and have posted it over 100 times in NG already :P) D absolutely needs either way to force internal linkage or good LTO symbol elimination. Without preprocessor there is not much you can do to eliminate code duplication other than templates / CTFE - and it bloats resulting executables damn lot, something that was very controllable in C.
templates aren't guaranteed to inline, and they produce horrible symbol bloat. ctfe isn't inlining, it's pre-computation/runtime elimination. mixin is the closest, but it can't be used effectively in expressions, is horribly dangerous and generally horrible, and requires much keyword pollution.
What I mean is that right now symbols always make their way into the object file, whether they are inlined or not. So bloat is completely unaffected by inlining here. And the spec currently kind of prevents removing those. One option could be to fix `export` like http://wiki.dlang.org/DIP45 proposes and get some kind of LTO. The simpler solution I personally would favor is to make template symbols internally linked by default and available only if explicitly aliased. But anything is really better than the current state.
Dec 13 2013
parent Manu <turkeyman gmail.com> writes:
On 14 December 2013 03:07, Dicebot <public dicebot.lv> wrote:

 On Friday, 13 December 2013 at 17:01:21 UTC, Manu wrote:

 On 14 December 2013 02:50, Dicebot <public dicebot.lv> wrote:

  On Friday, 13 December 2013 at 16:28:33 UTC, Manu wrote:
  I could still REALLY do with __forceinline though. D doesn't have an
 effective macro.
 Obviously, if by 'language X' you mean 'any non-compiled language with
 pointers', then I totally agree! People who make claims like you say,
 don't
 generally know what they're talking about, or what C is actually used
 for.
I believe (and have posted it over 100 times in NG already :P) D absolutely needs either way to force internal linkage or good LTO symbol elimination. Without preprocessor there is not much you can do to eliminate code duplication other than templates / CTFE - and it bloats resulting executables damn lot, something that was very controllable in C.
templates aren't guaranteed to inline, and they produce horrible symbol bloat. ctfe isn't inlining, it's pre-computation/runtime elimination. mixin is the closest, but it can't be used effectively in expressions, is horribly dangerous and generally horrible, and requires much keyword pollution.
What I mean is that right now symbols always make their way to object file, whenever they are inlined or not. So bloat is completely unaffected by inlining here. And spec currently kind of prevents removing those. One option could have been to fix `export` like http://wiki.dlang.org/DIP45 proposes and get some kind of LTO. Simpler solution I personally would favor is to make template symbols internally linked by default and available only if explicitly aliased. But anything is really better than the current state.
Ah I see what you mean. Yes, you are correct. Any real embedded dev in the future will need to have these issues addressed as a matter of practicality.
Dec 13 2013
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/13/2013 9:01 AM, Manu wrote:
 We really do need __forceinline. Walter did agreed on one occasion. He said
 something like "I've been thinking on it, and I think you might be right",
 which
 is almost a mental commitment... so there's hope! :P
 Sadly it was in a hotel parking lot, and not committed to the eternal historic
 record (ie, the forum).
Yes, I remember, and you did convince me!
Dec 13 2013
next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
13-Dec-2013 22:29, Walter Bright writes:
 On 12/13/2013 9:01 AM, Manu wrote:
 We really do need __forceinline. Walter did agreed on one occasion. He
 said
 something like "I've been thinking on it, and I think you might be
 right", which
 is almost a mental commitment... so there's hope! :P
 Sadly it was in a hotel parking lot, and not committed to the eternal
 historic
 record (ie, the forum).
Yes, I remember, and you did convince me!
Yay!
-- Dmitry Olshansky
Dec 13 2013
prev sibling parent Manu <turkeyman gmail.com> writes:
On 14 December 2013 04:29, Walter Bright <newshound2 digitalmars.com> wrote:

 On 12/13/2013 9:01 AM, Manu wrote:

 We really do need __forceinline. Walter did agreed on one occasion. He
 said
 something like "I've been thinking on it, and I think you might be
 right", which
 is almost a mental commitment... so there's hope! :P
 Sadly it was in a hotel parking lot, and not committed to the eternal
 historic
 record (ie, the forum).
Yes, I remember, and you did convince me!
*gasp*, it's on record!! This is a good day! :P
Dec 13 2013
prev sibling parent reply "jerro" <a a.com> writes:
 The alt-compilers have an attribute... if only we could alias 
 attributes
 (or groups of attributes). Another thing we need... :/
With GDC you can already do this:
import gcc.attribute;
@attribute("forceinline") foo() ...
Dec 14 2013
parent "jerro" <a a.com> writes:
  @attribute("forceinline") foo()
should be @attribute("forceinline") ReturnType foo(), of course.
Dec 14 2013
prev sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 12 December 2013 at 12:21:31 UTC, Dicebot wrote:
 I can't find code Adam used to provide minimal d runtime stubs 
 to compile C-like programs but he was forced to use in-line 
 assembly there in few cases. Can't remember details, sorry.
http://arsdnet.net/dcode/minimal.zip (not sure if it still compiles on new dmd, I haven't played with it for months and druntime is a moving target)
The main inline asm usage was to make system calls on Linux without libc or to poke the hardware on bare metal; there isn't a lot of it that is strictly necessary.
On Thursday, 12 December 2013 at 11:16:07 UTC, Dicebot wrote:
 But it leaves you with a very
 crippled language that does not even help you in sticking with
 that crippled subset. At this point you really start asking
 yourself - what does this give me over raw C to motivate the
 transition? So far I don't see anything convincing.
There are still some nice benefits: you can use the compile time stuff of D, exceptions, classes, custom array types; a lot of the language actually works if you spend the time on it. Though I never did anything serious with it, I stopped at the proof of concept phase.
Dec 12 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/12/2013 3:16 AM, Dicebot wrote:
 To remove all "smart"
 side-effects in D you need to get rid of all druntime, avoid
 using some language features and resort to inline assembly
 relatively often.
I don't see why you'd have to resort to inline assembler in D any more than in C.
 But it leaves you with a very crippled language
Not more "crippled" than C is.
 that does not even help you in sticking with that crippled subset.
Is there a point to having a compiler flag that'll warn you if you use "pure"?
 At this point you really start asking
 yourself - what does this give me over raw C to motivate the
 transition? So far I don't see anything convincing.
Off the top of my head:
1. compile speed
2. dependable sizes of basic types
3. unicode
4. wchar_t that is actually usable
5. thread local storage
6. no global errno being set by the math library functions
7. proper IEEE 754 floating point
8. no preprocessor madness
9. modules
10. being able to pass array types to functions without them degenerating to pointers
11. inline assembler being a part of the language rather than an extension that is in a markedly different format for every compiler
12. forward referencing (no need to declare everything twice)
13. no need for .h files
14. no ridonculous struct tag name space with all those silly typedef struct S { ... } S; declarations.
15. no need for precompiled headers
16. struct alignment as a language feature rather than an ugly extension kludge
17. no #include guard kludges
18. #define BEGIN { is thankfully not possible
19. no need for global variables when qsorting
20. no global locale madness
And if you use D features even modestly, such as auto, purity, out variables, @safe, const, etc., you can get a large improvement in clarity in function APIs.
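As an illustration of item 19, here is a small C sketch of the problem: the standard qsort() comparator takes only the two elements, so any extra context has to travel through a file-scope variable. The names sort_ints, cmp_int and g_descending are made up for the example. Some platforms offer qsort_r or qsort_s variants that take a context pointer, but their signatures differ between platforms.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical context: sort direction. The comparator cannot receive it
   as an argument, so it lives in a file-scope variable. */
static int g_descending = 0;

/* Comparator in the shape qsort() requires: two elements, nothing else. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    int r = (x > y) - (x < y); /* -1, 0 or 1 without overflow */
    return g_descending ? -r : r;
}

/* Sorts v[0..n) ascending, or descending if `descending` is non-zero.
   Not reentrant or thread-safe because of the shared global. */
void sort_ints(int *v, size_t n, int descending)
{
    g_descending = descending;
    qsort(v, n, sizeof v[0], cmp_int);
}
```

The global makes the function unsafe to call from two threads at once, which is the kind of kludge the list item is complaining about.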
Dec 12 2013
next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 12 December 2013 at 17:56:12 UTC, Walter Bright 
wrote:
 5. thread local storage
I think this is a negative. D's TLS has caused me more problems than it has fixed: for example, if you write an in-process COM server in Windows XP, it will crash the host application if you hit almost any druntime call. Why? Because the TLS stuff isn't set up properly when the dll is loaded. Windows Vista managed to fix this, but there are a lot of people who use XP, and this is a big problem. Same thing running D on bare metal. Maybe I can fix this by setting up the segment registers or reading the executable, idk, but __gshared just works in that environment, whereas TLS doesn't. As I understand it, the Android and Macintosh operating systems have, or at least had, TLS problems too. I agree with the rest of them, but D's default TLS has been a big pain to me.
Dec 12 2013
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/12/2013 10:46 AM, Adam D. Ruppe wrote:
 I agree with the rest of them, but D's default TLS has been a big pain to me.
You're right that TLS on XP with DLLs is a miserable problem. Fortunately, with TLS now standard at least in C++, this problem is going away. And, of course, you can use __gshared instead.
Dec 12 2013
prev sibling parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
On 12.12.2013 19:46, Adam D. Ruppe wrote:
 On Thursday, 12 December 2013 at 17:56:12 UTC, Walter Bright wrote:
 5. thread local storage
I think this is a negative. D's TLS has caused me more problems than it has fixed: for example, if you write an in-process COM server in Windows XP, it will crash the host application if you hit almost any druntime call. Why? Because the TLS stuff isn't set up properly when the dll is loaded. Windows Vista managed to fix this, but there's a lot of people who use XP, and this is a big problem.
Implicit TLS in XP DLLs has had a workaround in druntime for a few years now (emulating it for the system). IIRC Denis has even found a solution for unloading these DLLs later. It's great to have thread local data easily available, but I'm not completely sold on thread-local by default either. One disadvantage is that both memory and initialization code affect every thread, even if it does not use TLS at all. This doesn't scale for large programs. Usually you should not create a lot of global variables, and if I do, I mostly want them shared. But coming from C++ I often tend to forget to add the modifier. I remember a few years ago, almost every occurrence of TLS in phobos was converted to shared or immutable...
Dec 13 2013
parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 13 December 2013 at 13:07:32 UTC, Rainer Schuetze 
wrote:
 Implicit TLS in XP-DLLs has a workaround in druntime for a few 
 years now (emulating it for the system). IIRC Denis has even 
 found a solution how to unload these DLLs later.
There must still be some bugs, I tried the COM thing not long ago (I think it was October) and it worked so perfectly on Vista and Win7 yet failed so miserably on XP. D exes are fine, but the dll loaded into another program didn't work well at all.
 Usually you should not create a lot of global variables, and if 
 I do, I mostly want them shared. But coming from C++ I often 
 tend to forget to add the modifier. I remember a few years ago, 
 almost every occurrence of TLS in phobos was converted to 
 shared or immutable...
Aye.
Dec 13 2013
parent Rainer Schuetze <r.sagitario gmx.de> writes:
On 13.12.2013 15:59, Adam D. Ruppe wrote:
 On Friday, 13 December 2013 at 13:07:32 UTC, Rainer Schuetze wrote:
 Implicit TLS in XP-DLLs has a workaround in druntime for a few years
 now (emulating it for the system). IIRC Denis has even found a
 solution how to unload these DLLs later.
There must still be some bugs, I tried the COM thing not long ago (I think it was October) and it worked so perfectly on Vista and Win7 yet failed so miserably on XP. D exes are fine, but the dll loaded into another program didn't work well at all.
Do you use DllMain from http://dlang.org/dll.html#Cinterface? I guess so, if it runs on Vista/Win7. What version of XP are you running? It might be a special version of ntdll.dll that is not supported. (I'll have to dig up supported versions from an older disk, though.) Visual D is full of COM, but it does not use the default COM implementation, see https://d.puremagic.com/issues/show_bug.cgi?id=4092.
Dec 13 2013
prev sibling next sibling parent reply "Max Samukha" <maxsamukha gmail.com> writes:
On Thursday, 12 December 2013 at 17:56:12 UTC, Walter Bright 
wrote:

 11. inline assembler being a part of the language rather than 
 an extension that is in a markedly different format for every 
 compiler
Ahem. If we admit that x86 is not the only ISA in existence, then what is (under)specified here http://dlang.org/iasm.html is a platform-specific extension.
Dec 12 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/12/2013 11:57 AM, Max Samukha wrote:
 On Thursday, 12 December 2013 at 17:56:12 UTC, Walter Bright wrote:

 11. inline assembler being a part of the language rather than an extension
 that is in a markedly different format for every compiler
Ahem. If we admit that x86 is not the only ISA in existence, then what is (under)specified here http://dlang.org/iasm.html is a platform-specific extension.
I know of at least 3 different C x86 inline assembler syntaxes. This is not convenient, to say the least.
Dec 12 2013
parent reply "Max Samukha" <maxsamukha gmail.com> writes:
On Thursday, 12 December 2013 at 20:06:37 UTC, Walter Bright 
wrote:
 On 12/12/2013 11:57 AM, Max Samukha wrote:
 On Thursday, 12 December 2013 at 17:56:12 UTC, Walter Bright 
 wrote:

 11. inline assembler being a part of the language rather than 
 an extension
 that is in a markedly different format for every compiler
Ahem. If we admit that x86 is not the only ISA in existence, then what is (under)specified here http://dlang.org/iasm.html is a platform-specific extension.
I know of at least 3 different C x86 inline assembler syntaxes. This is not convenient, to say the least.
I know that too. I appreciate that you attempted to standardize the asm for x86. But the question is what to do about other targets? What about ARM, MIL, LLVM IR or whatever low-level target a D compiler may compile to? Will those be standardized as part of the language?
Dec 12 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/12/2013 12:16 PM, Max Samukha wrote:
 But the question is what to do about other targets? What about ARM, MIL, LLVM IR
 or whatever low-level target a D compiler may compile to? Will those be
 standardized as part of the language?
I certainly think they ought to be.
Dec 12 2013
parent reply "Max Samukha" <maxsamukha gmail.com> writes:
On Thursday, 12 December 2013 at 20:24:19 UTC, Walter Bright 
wrote:
 On 12/12/2013 12:16 PM, Max Samukha wrote:
 But the question is what to do about other targets? What about 
 ARM, MIL, LLVM IR
 or whatever low-level target a D compiler may compile to? 
 Will those be
 standardized as part of the language?
I certainly think they ought to be.
Don't you find it somewhat alarming that both alternative compilers follow neither the standard inline asm nor ABI? Maybe it would be wiser to call those standard extensions or whatever than claiming they are part of the language?
Dec 12 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/12/2013 12:33 PM, Max Samukha wrote:
 Don't you find it somewhat alarming that both alternative compilers follow
 neither the standard inline asm nor ABI?
I find it unfortunate. But it also can be difficult and time consuming to reimplement an assembler for those back ends, so I can understand why it isn't a priority.
Dec 12 2013
next sibling parent "Max Samukha" <maxsamukha gmail.com> writes:
On Thursday, 12 December 2013 at 20:46:26 UTC, Walter Bright 
wrote:
 On 12/12/2013 12:33 PM, Max Samukha wrote:
 Don't you find it somewhat alarming that both alternative 
 compilers follow
 neither the standard inline asm nor ABI?
I find it unfortunate. But it also can be difficult and time consuming to reimplement an assembler for those back ends, so I can understand why it isn't a priority.
You know that it is not likely to ever be a priority. That's what will happen: lots of code will be written, using the peculiarities of each implementation. And then nobody will want to change the status quo because "it will break lots of code".
Dec 12 2013
prev sibling next sibling parent reply Marco Leise <Marco.Leise gmx.de> writes:
Am Thu, 12 Dec 2013 12:46:26 -0800
schrieb Walter Bright <newshound2 digitalmars.com>:

 On 12/12/2013 12:33 PM, Max Samukha wrote:
 Don't you find it somewhat alarming that both alternative compilers follow
 neither the standard inline asm nor ABI?
I find it unfortunate. But it also can be difficult and time consuming to reimplement an assembler for those back ends, so I can understand why it isn't a priority.
It's not just that. I found that some discussion and work is necessary about these issues, if D is ever going to have a standard inline assembly language:

* GCC would have to support naked functions on x86/amd64 or DMD drop the keyword.
* DMD would have to adopt extended inline assembly expressions or GDC/LDC would have to "downgrade" to basic inline assembly.

A "downgrade" needs good arguments though. Not only did D evolve over C, but the same is true for inline assembly. From http://wiki.dlang.org/LDC_inline_assembly_expressions:

  Being an expression, extended inline expressions are able to return values! Additionally issues regarding inlining of function containing inline asm are mostly not relevant for extended inline assembly expressions. Effectively, extended inline assembly expression can be used to efficiently implement new intrinsics in the compiler.

-- 
Marco
Dec 13 2013
parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Friday, 13 December 2013 at 13:57:09 UTC, Marco Leise wrote:
 Am Thu, 12 Dec 2013 12:46:26 -0800
 schrieb Walter Bright <newshound2 digitalmars.com>:

 On 12/12/2013 12:33 PM, Max Samukha wrote:
 Don't you find it somewhat alarming that both alternative 
 compilers follow
 neither the standard inline asm nor ABI?
I find it unfortunate. But it also can be difficult and time consuming to reimplement an assembler for those back ends, so I can understand why it isn't a priority.
It's not just that. I found that some discussion and work is necessary about these issues, if D is ever going to have a standard inline assembly language:

* GCC would have to support naked functions on x86/amd64 or DMD drop the keyword.
* DMD would have to adopt extended inline assembly expressions or GDC/LDC would have to "downgrade" to basic inline assembly.

A "downgrade" needs good arguments though. Not only did D evolve over C, but the same is true for inline assembly. From http://wiki.dlang.org/LDC_inline_assembly_expressions:

  Being an expression, extended inline expressions are able to return values! Additionally issues regarding inlining of function containing inline asm are mostly not relevant for extended inline assembly expressions. Effectively, extended inline assembly expression can be used to efficiently implement new intrinsics in the compiler.
Maybe the best way to fix this issue is to follow what other language standards do (C++, Ada) and only define that inline assembly is possible and what the entry point, e.g. asm (), looks like. The real inline assembly syntax is then left implementation specific. -- Paulo
Dec 13 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Dec 13, 2013 at 03:30:21PM +0100, Paulo Pinto wrote:
[...]
 Maybe the best way to fix this issue is to follow what other
 language standards do (C++, Ada) and only define that inline
 assembly is possible and how the entry point, e.g. asm () looks
 like.
 
 The real inline assembly syntax is then left implementation
 specific.
But isn't this what Walter was arguing against? He wanted to standardize inline assembly syntax for x86 because leaving it up to implementation resulted in the current mess of Intel syntax vs. GNU syntax (which can be extremely confusing if you're not well-versed in both syntaxes, since the order of operands is swapped and there are some subtle notational differences). T -- Computers shouldn't beep through the keyhole.
Dec 13 2013
next sibling parent Marco Leise <Marco.Leise gmx.de> writes:
Am Fri, 13 Dec 2013 09:34:17 -0800
schrieb "H. S. Teoh" <hsteoh quickfur.ath.cx>:

 On Fri, Dec 13, 2013 at 03:30:21PM +0100, Paulo Pinto wrote:
 [...]
 Maybe the best way to fix this issue is to follow what other
 language standards do (C++, Ada) and only define that inline
 assembly is possible and how the entry point, e.g. asm () looks
 like.
 
 The real inline assembly syntax is then left implementation
 specific.
But isn't this what Walter was arguing against? He wanted to standardize inline assembly syntax for x86 because leaving it up to implementation resulted in the current mess of Intel syntax vs. GNU syntax (which can be extremely confusing if you're not well-versed in both syntaxes, since the order of operands is swapped and there are some subtle notational differences). T
I for one am in favor of not having to write an ASM function 6 times for x86/amd64 * dmd/ldc/gdc. Before that happens I'll write a mixin generator that takes a string of ASM instructions and converts it to all three syntaxes at once. -- Marco
Dec 13 2013
prev sibling parent Paulo Pinto <pjmlp progtools.org> writes:
Am 13.12.2013 18:34, schrieb H. S. Teoh:
 On Fri, Dec 13, 2013 at 03:30:21PM +0100, Paulo Pinto wrote:
 [...]
 Maybe the best way to fix this issue is to follow what other
 language standards do (C++, Ada) and only define that inline
 assembly is possible and how the entry point, e.g. asm () looks
 like.

 The real inline assembly syntax is then left implementation
 specific.
But isn't this what Walter was arguing against? He wanted to standardize inline assembly syntax for x86 because leaving it up to implementation resulted in the current mess of Intel syntax vs. GNU syntax (which can be extremely confusing if you're not well-versed in both syntaxes, since the order of operands is swapped and there are some subtle notational differences). T
Yeah, but it is an easier path to have something standardized that corresponds to reality, than making all frontends agree on the same syntax and semantics for inline assembler. If I understood correctly the current issues, that is. -- Paulo
Dec 13 2013
prev sibling parent reply "David Nadlinger" <code klickverbot.at> writes:
On Thursday, 12 December 2013 at 20:46:26 UTC, Walter Bright 
wrote:
 On 12/12/2013 12:33 PM, Max Samukha wrote:
 Don't you find it somewhat alarming that both alternative 
 compilers follow
 neither the standard inline asm nor ABI?
I find it unfortunate. But it also can be difficult and time consuming to reimplement an assembler for those back ends, so I can understand why it isn't a priority.
LDC in fact implements DMD-style inline assembly (occasionally there are bugs, though, as it's a complete reimplementation). I don't think it would be unreasonable to work towards a common D ABI on the various Posix x86_64 systems, but given that DMD comes with its own bespoke exception handling implementation which doesn't really make sense to implement in GDC/LDC (as libunwind is the platform standard on Linux/… anyway), there is not really much motivation to start work on aligning the other parts of the ABI either. David
Dec 13 2013
next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Friday, 13 December 2013 at 19:07:47 UTC, David Nadlinger 
wrote:
 On Thursday, 12 December 2013 at 20:46:26 UTC, Walter Bright 
 wrote:
 On 12/12/2013 12:33 PM, Max Samukha wrote:
 Don't you find it somewhat alarming that both alternative 
 compilers follow
 neither the standard inline asm nor ABI?
I find it unfortunate. But it also can be difficult and time consuming to reimplement an assembler for those back ends, so I can understand why it isn't a priority.
LDC in fact implements DMD-style inline assembly (occasionally there are bugs, though, as it's a complete reimplementation). I don't think it would be unreasonable to work towards a common D ABI on the various Posix x86_64 systems, but given that DMD comes with its own bespoke exception handling implementation which doesn't really make sense to implement in GDC/LDC (as libunwind is the platform standard on Linux/… anyway), there is not really much motivation to start work on aligning the other parts of the ABI either. David
I'm reinventing it right now for SDC, so it indeed makes sense.
Dec 13 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/13/2013 12:06 PM, deadalnix wrote:
 I'm reinventing it right now for SDC, so it indeed makes sense.
Reinventing EH or inline asm?
Dec 13 2013
parent "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 14 December 2013 at 04:09:18 UTC, Walter Bright 
wrote:
 On 12/13/2013 12:06 PM, deadalnix wrote:
 I'm reinventing it right now for SDC, so it indeed makes sense.
Reinventing EH or inline asm?
EH. Still very similar to what LDC does (which is understandable as it uses LLVM as well).
Dec 13 2013
prev sibling next sibling parent Brad Roberts <braddr puremagic.com> writes:
On 12/13/13 11:07 AM, David Nadlinger wrote:
 On Thursday, 12 December 2013 at 20:46:26 UTC, Walter Bright wrote:
 On 12/12/2013 12:33 PM, Max Samukha wrote:
 Don't you find it somewhat alarming that both alternative compilers follow
 neither the standard inline asm nor ABI?
I find it unfortunate. But it also can be difficult and time consuming to reimplement an assembler for those back ends, so I can understand why it isn't a priority.
LDC in fact implements DMD-style inline assembly (occasionally there are bugs, though, as it's a complete reimplementation). I don't think it would be unreasonable to work towards a common D ABI on the various Posix x86_64 systems, but given that DMD comes with its own bespoke exception handling implementation which doesn't really make sense to implement in GDC/LDC (as libunwind is the platform standard on Linux/… anyway), there is not really much motivation to start work on aligning the other parts of the ABI either. David
I think it's very important to work towards a common D ABI. Even further, I believe it's important for druntime and phobos to be binary compatible between the compilers. That dmd uses a different EH scheme is more a result of Walter not understanding the standard Linux C++ EH mechanism well enough to implement it, and instead taking the path of least resistance and writing his own. This is very correctable, it just needs someone to do it.
Dec 13 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/13/2013 11:07 AM, David Nadlinger wrote:
 LDC in fact implements DMD-style inline assembly (occasionally there are bugs,
 though, as it's a complete reimplementation).
Thank you! That's awesome!
 I don't think it would be unreasonable to work towards a common D ABI on the
 various Posix x86_64 systems, but given that DMD comes with its own bespoke
 exception handling implementation which doesn't really make sense to implement
 in GDC/LDC (as libunwind is the platform standard on Linux/… anyway), there is
 not really much motivation to start work on aligning the other parts of the ABI
 either.
I agree that an ABI would be good to work towards. Brad's right about why dmd's EH is different.
Dec 13 2013
parent reply Iain Buclaw <ibuclaw gdcproject.org> writes:
On 14 December 2013 04:08, Walter Bright <newshound2 digitalmars.com> wrote:
 On 12/13/2013 11:07 AM, David Nadlinger wrote:
 LDC in fact implements DMD-style inline assembly (occasionally there are
 bugs,
 though, as it's a complete reimplementation).
Thank you! That's awesome!
The implementation of which existed in GDC first, and was released under a dual GPL/BSD license to allow LDC devs to use and improve it (they added 64bit assembler support for instance, years before DMD got 64bit support), and then subsequently dropped from GDC for a number of valid reasons:

1) Transition towards making a platform/target agnostic frontend implementation.

2) GDC doesn't and never will implement the DMD-style calling convention, so all inline assembly in druntime and phobos actually doesn't work with GDC - there's actually a bug report about GDC incorrectly pre-defining D_InlineAsm and friends because of this.

3) It is a really big WAT on having two conflicting styles, one for x86, another for everything else.

4) This and other x86 hacks were a problem with code reviewers from GCC.

Though saying that, whilst DMD-style was not ideal, GDC-style isn't either, as it requires parser changes, and adds a new class to handle them in the frontend.
Dec 14 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/14/2013 6:08 AM, Iain Buclaw wrote:
 The implementation of which existed in GDC first, and was released as
 dual GPL/BSD license to allow LDC devs to use and improve (they
 added 64bit assembler support for instance, years before DMD got 64bit
 support),
I didn't know this, thanks for telling me.
 and then subsequently dropped from GDC for a number of valid
 reasons:

 1) Transition towards making a platform/target agnostic frontend
 implementation.

 2) Don't and will never implement the DMD-style calling convention, so
 all inline assembly in druntime and phobos actually doesn't work with
 GDC - there's actually a bug report about GDC incorrectly pre-defining
 D_InlineAsm and friends because of this.
dmd's works on multiple platforms and uses version statements to account for ABI differences. It's still easier than having a different syntax.
 3) It is a really big WAT on having two conflicting styles, one for
 x86, another for everything else.
gcc gets them wrong for everything else, too :-)
 4) This and other x86 hacks were a problem with code reviewers from GCC.
I can understand that.
 Though saying that, whilst DMD-style was not ideal, neither is
 GDC-style either, as it requires parser changes, and adds a new class
 to handle them in the frontend.
Dec 14 2013
parent reply Iain Buclaw <ibuclaw gdcproject.org> writes:
On 14 December 2013 17:44, Walter Bright <newshound2 digitalmars.com> wrote:
 On 12/14/2013 6:08 AM, Iain Buclaw wrote:
 The implementation of which existed in GDC first, and was released as
 dual GPL/BSD license to allow LDC devs to use and improve (they
 added 64bit assembler support for instance, years before DMD got 64bit
 support),
I didn't know this, thanks for telling me.
 and then subsequently dropped from GDC for a number of valid
 reasons:

 1) Transition towards making a platform/target agnostic frontend
 implementation.

 2) Don't and will never implement the DMD-style calling convention, so
 all inline assembly in druntime and phobos actually doesn't work with
 GDC - there's actually a bug report about GDC incorrectly pre-defining
 D_InlineAsm and friends because of this.
dmd's works on multiple platforms and uses version statements to account for ABI differences. It's still easier than having a different syntax.
What I meant was that, despite GDC being able to handle IASM syntax at one time, code in e.g. std.math or core.cpuid was altered specifically to work with the ABI GDC used on x86 (cdecl). This was especially noticeable with naked assembly, or normal assembly that assumed e.g. EAX or the ST registers would be left untouched by the compiler on exiting the function.
 3) It is a really big WAT on having two conflicting styles, one for
 x86, another for everything else.
gcc gets them wrong for everything else, too :-)
From an implementor's point of view, it's easier to pass off inline assembly directly to the assembler, rather than try to manage it in the compiler. I honestly don't know how one would be able to make AsmStatement work for non-x86 architectures. At least this is not possible in GDC unless you want to resort to doing things in a way that is frowned upon (like checking the definition of a particular TARGET macro :)
Dec 14 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/14/2013 11:46 AM, Iain Buclaw wrote:
 I honestly don't know how one would be able to make AsmStatement work
 for non-x86 architectures.  At least this is not possible in GDC
 unless you want to resort to doing things in a way that are shamed
 upon (like checking the definition of a particular TARGET macro :)
I have no idea why it would be hard for non-x86?
Dec 14 2013
parent Iain Buclaw <ibuclaw gdcproject.org> writes:
On Dec 14, 2013 10:51 PM, "Walter Bright" <newshound2 digitalmars.com>
wrote:
 On 12/14/2013 11:46 AM, Iain Buclaw wrote:
 I honestly don't know how one would be able to make AsmStatement work
 for non-x86 architectures.  At least this is not possible in GDC
 unless you want to resort to doing things in a way that are shamed
 upon (like checking the definition of a particular TARGET macro :)
I have no idea why it would be hard for non-x86?
Unlike dmd - gdc (the front end language) doesn't know/doesn't care about what precise platform/target it is compiling for from within gcc's framework. It may know features of the target - pointer size, real type, va_list, which direction the stack grows - just not enough to know which architecture to interpret for. So writing an assembler for ARM was an interesting exercise, but gave zero brownie points in terms of usefulness.
Dec 15 2013
prev sibling parent dennis luehring <dl.soluz gmx.net> writes:
Am 12.12.2013 21:16, schrieb Max Samukha:
 On Thursday, 12 December 2013 at 20:06:37 UTC, Walter Bright
 wrote:
 On 12/12/2013 11:57 AM, Max Samukha wrote:
 On Thursday, 12 December 2013 at 17:56:12 UTC, Walter Bright
 wrote:

 11. inline assembler being a part of the language rather than
 an extension
 that is in a markedly different format for every compiler
Ahem. If we admit that x86 is not the only ISA in existence, then what is (under)specified here http://dlang.org/iasm.html is a platform-specific extension.
I know of at least 3 different C x86 inline assembler syntaxes. This is not convenient, to say the least.
I know that too. I appreciate that you attempted to standardize the asm for x86. But the question is what to do about other targets? What about ARM, MIL, LLVM IR or whatever low-level target a D compiler may compile to? Will those be standardized as part of the language?
Like FreePascal, which has had support for x86 and ARM inline asm (and others) for years?
Dec 12 2013
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Dec 12, 2013 at 08:57:42PM +0100, Max Samukha wrote:
 On Thursday, 12 December 2013 at 17:56:12 UTC, Walter Bright wrote:
 
11. inline assembler being a part of the language rather than an
extension that is in a markedly different format for every
compiler
Ahem. If we admit that x86 is not the only ISA in existence, then what is (under)specified here http://dlang.org/iasm.html is a platform-specific extension.
I've always wondered about that. What is D supposed to do with asm blocks when compiling for a CPU that *isn't* x86?? What *should* a conforming compiler do? Translate x86 asm into the target CPU's instructions? Abort compilation? None of those options sound particularly appealing to me. T -- Let's not fight disease by killing the patient. -- Sean 'Shaleh' Perry
Dec 12 2013
next sibling parent Paulo Pinto <pjmlp progtools.org> writes:
Am 12.12.2013 21:08, schrieb H. S. Teoh:
 On Thu, Dec 12, 2013 at 08:57:42PM +0100, Max Samukha wrote:
 On Thursday, 12 December 2013 at 17:56:12 UTC, Walter Bright wrote:

 11. inline assembler being a part of the language rather than an
 extension that is in a markedly different format for every
 compiler
Ahem. If we admit that x86 is not the only ISA in existence, then what is (under)specified here http://dlang.org/iasm.html is a platform-specific extension.
I've always wondered about that. What is D supposed to do with asm blocks when compiling for a CPU that *isn't* x86?? What *should* a conforming compiler do? Translate x86 asm into the target CPU's instructions? Abort compilation? None of those options sound particularly appealing to me. T
I already argued a few times here that although inline assembly seems convenient, I do favour the use of external macro assemblers. There will always be some ISAs that are more special than others. So I'd rather have clean higher-level code that drops out to assembly than have version() for each processor and lack thereof. So far I have only used dmd, but as far as I know both gdc and ldc don't follow the same asm syntax anyway. -- Paulo
Dec 12 2013
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/12/2013 12:08 PM, H. S. Teoh wrote:
 I've always wondered about that. What is D supposed to do with asm
 blocks when compiling for a CPU that *isn't* x86??
Give an error. asm blocks should be protected with version statements for the CPU type. The asm format should be what the CPU manufacturer lists as the format in their CPU data sheets.
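A minimal sketch of that pattern, assuming a compiler that predefines D_InlineAsm_X86 or D_InlineAsm_X86_64 and accepts DMD-style asm blocks. The function name addOne and the version identifier UseX86Asm are invented for this example:

```d
// Fold the two x86 flavours into one (hypothetical) version identifier.
version (D_InlineAsm_X86)    version = UseX86Asm;
version (D_InlineAsm_X86_64) version = UseX86Asm;

// Returns x + 1, using inline asm only where the compiler advertises it.
int addOne(int x)
{
    version (UseX86Asm)
    {
        asm
        {
            mov EAX, x;   // load the parameter
            inc EAX;      // add one
            mov x, EAX;   // store the result back
        }
        return x;
    }
    else
    {
        // Portable fallback so the sketch builds everywhere. To get a hard
        // error on unsupported CPUs instead, this branch could be:
        //     static assert(0, "addOne: no inline asm for this target");
        return x + 1;
    }
}

void main()
{
    assert(addOne(41) == 42);
}
```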
Dec 12 2013
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
On 13 December 2013 06:08, H. S. Teoh <hsteoh quickfur.ath.cx> wrote:

 On Thu, Dec 12, 2013 at 08:57:42PM +0100, Max Samukha wrote:
 On Thursday, 12 December 2013 at 17:56:12 UTC, Walter Bright wrote:

11. inline assembler being a part of the language rather than an
extension that is in a markedly different format for every
compiler
Ahem. If we admit that x86 is not the only ISA in existence, then what is (under)specified here http://dlang.org/iasm.html is a platform-specific extension.
I've always wondered about that. What is D supposed to do with asm blocks when compiling for a CPU that *isn't* x86?? What *should* a conforming compiler do? Translate x86 asm into the target CPU's instructions? Abort compilation? None of those options sound particularly appealing to me.
It occurs to me that a little sugar would be nice, rather than:

version(x86)
{
  asm
  {
    ...
  }
}
else version(ARM)
{
  asm
  {
    ...
  }
}

Which appears basically everywhere an asm block does. 'asm' could optionally receive an architecture as argument, and lower to the version wrapper:

asm(x86)
{
  ...
}
else asm(ARM)
{
  ...
}

(The 'else's in those examples seem unnecessary)
Dec 12 2013
parent reply "Daniel Murphy" <yebblies nospamgmail.com> writes:
"Manu" <turkeyman gmail.com> wrote in message 
news:mailman.513.1386905921.3242.digitalmars-d puremagic.com...
 Which appears basically everywhere an asm block does. 'asm' could
 optionally receive an architecture as argument, and lower to the version
 wrapper:

 asm(x86)
 {
  ...
 }
 else asm(ARM)
 {
  ...
 }

 (The 'else's in those examples seem unnecessary)
meh

version(x86) asm {
}
else version(ARM) asm {
}
else ...
Dec 12 2013
parent Manu <turkeyman gmail.com> writes:
On 13 December 2013 15:39, Daniel Murphy <yebblies nospamgmail.com> wrote:

 "Manu" <turkeyman gmail.com> wrote in message
 news:mailman.513.1386905921.3242.digitalmars-d puremagic.com...
 Which appears basically everywhere an asm block does. 'asm' could
 optionally receive an architecture as argument, and lower to the version
 wrapper:

 asm(x86)
 {
  ...
 }
 else asm(ARM)
 {
  ...
 }

 (The 'else's in those examples seem unnecessary)
 meh

 version(x86) asm {
 }
 else version(ARM) asm {
 }
 else ...
Haha, okay. Good point. I never think of version statements like that for some reason. Mental throwback to #define perhaps, which requires its own line >_<
Dec 12 2013
prev sibling parent Manu <turkeyman gmail.com> writes:
On 13 December 2013 13:38, Manu <turkeyman gmail.com> wrote:

 On 13 December 2013 06:08, H. S. Teoh <hsteoh quickfur.ath.cx> wrote:

 On Thu, Dec 12, 2013 at 08:57:42PM +0100, Max Samukha wrote:
 On Thursday, 12 December 2013 at 17:56:12 UTC, Walter Bright wrote:

11. inline assembler being a part of the language rather than an
extension that is in a markedly different format for every
compiler
Ahem. If we admit that x86 is not the only ISA in existence, then what is (under)specified here http://dlang.org/iasm.html is a platform-specific extension.
I've always wondered about that. What is D supposed to do with asm blocks when compiling for a CPU that *isn't* x86?? What *should* a conforming compiler do? Translate x86 asm into the target CPU's instructions? Abort compilation? None of those options sound particularly appealing to me.
 It occurs to me that a little sugar would be nice, rather than:

 version(x86)
 {
   asm
   {
     ...
   }
 }
 else version(ARM)
 {
   asm
   {
     ...
   }
 }

 Which appears basically everywhere an asm block does. 'asm' could optionally receive an architecture as argument, and lower to the version wrapper:

 asm(x86)
 {
   ...
 }
 else asm(ARM)
 {
   ...
 }

 (The 'else's in those examples seem unnecessary)
The 'else' is useful:

asm(x86)
{
  ...
}
else asm(ARM)
{
  ...
}
else
{
  static assert(0, "Unsupported architecture!");
  // or a fallback implementation
}
Dec 12 2013
prev sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Thursday, 12 December 2013 at 17:56:12 UTC, Walter Bright 
wrote:
 that does not even help you in sticking with that crippled 
 subset.
Is there a point to having a compiler flag that'll warn you if you use "pure"?
Ugh, how is "pure" relevant? (I have not tried it but would expect it to work in C-like code, it is just an annotation after all). I am speaking about control over hidden allocations / GC, references to TypeInfo and stuff like that. Stuff you get in C out of the box because it does no clever magic. In D right now all you can do is get rid of the runtime and check for linker errors - not impossible to do, but clearly less convenient than compile-time errors.
 Off the top of my head:

 1. compile speed
Only partially true. Large projects need separate compilation and D does not behave that well in such a scenario. Still better than C, but not good enough to make a difference.
 2. dependable sizes of basic types
Not a real issue as your platform SDK always includes some kind of "stdint.h"
 3. unicode
Number one on my list of advantages. Does not apply to plenty of projects though.
 4. wchar_t that is actually usable
Same as (3)
 5. thread local storage
It is a lot of pain and a source of problems, not an advantage. Extra work to hack druntime to get stuff working on barebone.
 6. no global errno being set by the math library functions
This has made me smile :) It shows how different applications we have in mind speaking about "C domain".
 7. proper IEEE 754 floating point
Potentially useful but largely mitigated by platform-specific development.
 8. no preprocessor madness
Would have called this a feature if one could actually use something instead out of the box. But there is no way to control symbol visibility right now so templates / CTFE are often out of the toolset. And what to do without those?
 9. modules
Other than (1) it is also more a problem than help in current implementation - you need to care also about emitted ModuleInfo.
 10. being able to pass array types to functions without them 
 degenerating to pointers
Agreed,
 11. inline assembler being a part of the language rather than 
 an extension that is in a markedly different format for every 
 compiler
Not an issue. You almost always stick to specific compiler in barebone world (one adapted for your platform).
 12. forward referencing (no need to declare everything twice)
Not an issue. C programmers are not tired from typing.
 13. no need for .h files
Same as (9)
 14. no ridonculous struct tag name space with all those silly

     typedef struct S { ... } S;

 declarations.
Nice but just a syntax sugar yet again.
 15. no need for precompiled headers
Same as (9)
 16. struct alignment as a language feature rather than an ugly 
 extension kludge
Same as (11)
 17. no #include guard kludges
OH MY GOD EXTRA <10 LINE OVERHEAD PER HEADER
 18. #define BEGIN { is thankfully not possible
Very tempting but partially mitigated by (8)
 19. no need for global variables when qsorting
Doesn't matter
 20. no global locale madness
(no idea what this means)
 And if you use D features even modestly, such as auto, purity, 
 out variables, @safe, const, etc., you can get a large 
 improvement in clarity in function APIs.
`auto` has no real value in C world because there are no crazy template types. @safe is a joke for barebone, you will almost never be able to apply it :) Purity, transitive immutability - yeah, those are also top reasons in my list why I'd really love to see D in that domain. But most of those advantages are high-level advantages. To get there and make use of those you need to get through issues that hit you from the very beginning and frustrate _before_ you can see how awesome high-level stuff is:

- Allocation for stuff like array literals, making those unusable once you remove the runtime
- No internal linkage or reliable LTO - no way to take care of unused symbols and control binary size / layout
- Required to stub out TypeInfo / ModuleInfo to use literally anything

Note that those are not _fundamental_ language issues. It is possible to fix those via slight tweaks to compiler / spec. But it is the state of affairs right now and I need to include time estimates to implement those any time I am asked how feasible it would be to use D in that domain (some of my old colleagues have been casually asking me about it).
Dec 13 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/13/2013 6:52 AM, Dicebot wrote:
 1. compile speed
Only partially true. Large projects need separate compilation and D does not behave that good in such scenario. Still better than C, but not good enough to make a difference.
Doesn't behave that good how?
 2. dependable sizes of basic types
Not a real issue as your platform SDK always includes some kind of "stdint.h"
I've been around C code long enough to know it is a real issue. (Few programmers ever use stdint.h, and usually use it incorrectly when they do. Furthermore, those types aren't used by the C standard library, and are only very rarely used by 3rd party C libs. You cannot get away from this problem. It's made even worse because C will silently truncate integers to fit.)
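The silent truncation mentioned above can be sketched in a few lines of C. This is only an illustration: `store_length` and `store_length_checked` are hypothetical names, not taken from any project discussed in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: stores a 32-bit length into a 16-bit field.
   The implicit narrowing conversion needs no cast and no diagnostic is
   required; the value is reduced modulo 65536. */
uint16_t store_length(uint32_t length)
{
    uint16_t field = length;   /* implicit narrowing: high bits are dropped */
    return field;
}

/* Checked variant: reports failure instead of truncating.
   Returns 1 on success, 0 if the value does not fit. */
int store_length_checked(uint32_t length, uint16_t *out)
{
    assert(out != NULL);
    if (length > UINT16_MAX)
        return 0;
    *out = (uint16_t)length;
    return 1;
}
```

Calling `store_length(0x12345)` yields `0x2345` with no error at run time; whether the compiler warns depends on which optional warning flags are enabled.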
 3. unicode
Number one on my list of advantages. Does not apply to plenty of projects though.
It applies more than you might think. Most D apps will be inherently unicode correct. Very, very few C programs are unless the programmer went to some effort to make it so. And, take a look at all those miserable UNICODE macros in windows.h.
 4. wchar_t that is actually usable
Same as (3)
Almost nobody uses wchar_t in C code because it is unusable. Windows uses it for the "W" api functions, but surrogate pairs are broken in just about every C program, because C has no idea what a surrogate pair is. Furthermore, you're just fscked if you try to port wchar_t code from Windows to Linux, you're looking at line-by-line rewrite of all of that code.
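To make the surrogate pair point concrete, here is a minimal sketch of what a C program has to do by hand when `wchar_t` holds UTF-16 code units. The function name `utf16_next` is made up for this illustration, and returning U+FFFD for malformed input is one possible policy, not the only one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one code point from the UTF-16 code units starting at s[*i],
   advancing *i past the units consumed. Unpaired surrogates decode to
   U+FFFD (the replacement character). */
uint32_t utf16_next(const uint16_t *s, size_t len, size_t *i)
{
    assert(s != NULL && i != NULL && *i < len);
    uint16_t u = s[(*i)++];
    if (u >= 0xD800 && u <= 0xDBFF) {                 /* high surrogate */
        if (*i < len && s[*i] >= 0xDC00 && s[*i] <= 0xDFFF) {
            uint16_t lo = s[(*i)++];                  /* matching low surrogate */
            return 0x10000u + (((uint32_t)(u - 0xD800) << 10)
                               | (uint32_t)(lo - 0xDC00));
        }
        return 0xFFFDu;                               /* high surrogate without a pair */
    }
    if (u >= 0xDC00 && u <= 0xDFFF)
        return 0xFFFDu;                               /* stray low surrogate */
    return u;                                         /* ordinary BMP code unit */
}
```

Code that indexes a UTF-16 `wchar_t` array one element at a time and treats each element as a character skips this step, which is where the breakage described above comes from.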
 5. thread local storage
It is a lot of pain and a source of problems, not an advantage. Extra work to hack the druntime to get stuff working on barebone.
Making druntime work properly is not a problem for user programming. Most C programs care naught for global shared data, that is, until they try to multithread it. Then it's disasterville.
 6. no global errno being set by the math library functions
This has made me smile :) It shows how different applications we have in mind speaking about "C domain".
Do you mean people don't do math in C apps?
 7. proper IEEE 754 floating point
Potentially useful but largely mitigated by platform-specific development.
I suspect you haven't written serious FP apps. I have, and C's erratic and crappy support for FP is a serious problem. It's so bad that D's math libraries have gradually transitioned towards having our own implementation rather than rely on C's standard library.
 8. no preprocessor madness
Would have called this a feature if one could actually use something instead out of the box. But there is no way to control symbol visibility right now so templates / CTFE are often out of the toolset. And what to do without those?
I have no idea what your issue is here.
 9. modules
Other than (1) it is also more a problem than help in current implementation - you need to care also about emitted ModuleInfo.
Why do you need to care about it?
 11. inline assembler being a part of the language rather than an extension
 that is in a markedly different format for every compiler
Not an issue. You almost always stick to specific compiler in barebone world (one adapted for your platform).
I've had to port C code from one platform to another and had to do complete rewrites of the inline asm, even though they were for the exact same CPU. This is not convenient.
 12. forward referencing (no need to declare everything twice)
Not an issue. C programmers are not tired from typing.
C programs tend to be written "bottom up" to avoid forward references. This is not convenient.
 14. no ridonculous struct tag name space with all those silly

     typedef struct S { ... } S;

 declarations.
Nice but just a syntax sugar yet again.
The tag name space is not convenient.
 15. no need for precompiled headers
Same as (9)
Do you seriously believe C precompiled headers are convenient? Have you ever used them?
 16. struct alignment as a language feature rather than an ugly extension kludge
Same as (11)
Have you ever tried to port C code that uses alignment?
 17. no #include guard kludges
OH MY GOD EXTRA <10 LINE OVERHEAD PER HEADER
Not convenient. It also tends to be sensitive to typos that won't be detected by the compiler, and the usual non-hygienic macro problem. I see those typos now and then in C code I run across.
 19. no need for global variables when qsorting
Doesn't matter
Have you ever used qsort much? It is hardly convenient - you have to write a bunch of boilerplate to use it using global variables, and none of that will work if you've got multiple threads.
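The qsort boilerplate referred to here looks roughly like the following sketch. Standard `qsort` passes no user data to the comparator, so any context has to travel through a file-scope variable; `g_keys` and `sort_indices_by_key` are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Context for the comparator. Because qsort() has no user-data
   parameter, the key table lives at file scope. Two threads sorting
   with different key tables would race on this variable. */
static const int *g_keys;

static int compare_by_key(const void *a, const void *b)
{
    int ka = g_keys[*(const size_t *)a];
    int kb = g_keys[*(const size_t *)b];
    return (ka > kb) - (ka < kb);
}

/* Sort an array of indices so that keys[idx[0]] <= keys[idx[1]] <= ... */
void sort_indices_by_key(size_t *idx, size_t n, const int *keys)
{
    assert(idx != NULL && keys != NULL);
    g_keys = keys;                       /* global state set before the call */
    qsort(idx, n, sizeof idx[0], compare_by_key);
    g_keys = NULL;
}
```

Variants that take a context pointer exist (`qsort_r` on some Unix systems, `qsort_s` in the optional C11 Annex K), but their signatures differ between platforms, so they do not remove the portability problem.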
 20. no global locale madness
(no idea what this means)
strtod's behavior (for example) is dependent on the current locale. The fact that you didn't know this is indicative of its problems. This is not the only locale-dependent behavior. C has a number of issues with global state.
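A small sketch of the locale dependence, assuming a hosted C implementation. `parse_under_locale` is a hypothetical helper, and for brevity it resets `LC_NUMERIC` to "C" afterwards instead of restoring whatever locale was active before.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <stdlib.h>

/* Parse s with strtod() under the given LC_NUMERIC locale.
   Returns 0 if that locale is not available on this system.
   *consumed receives the number of characters strtod() accepted. */
int parse_under_locale(const char *locale_name, const char *s,
                       double *value, size_t *consumed)
{
    assert(locale_name != NULL && s != NULL);
    assert(value != NULL && consumed != NULL);
    if (setlocale(LC_NUMERIC, locale_name) == NULL)
        return 0;
    char *end;
    *value = strtod(s, &end);            /* decimal point depends on the locale */
    *consumed = (size_t)(end - s);
    setlocale(LC_NUMERIC, "C");          /* simplification: reset to "C" */
    return 1;
}
```

In the "C" locale, parsing "1.5" consumes all three characters. In a locale whose decimal point is a comma (for example a German locale, if one is installed), the same call is expected to stop at the '.', so one unrelated `setlocale()` call elsewhere in the program changes what the parser accepts.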
 And if you use D features even modestly, such as auto, purity, out variables,
 @safe, const, etc., you can get a large improvement in clarity in function
 APIs.
`auto` has no real value in C world
Going back to the integer type size issue and C's propensity to silently truncate integers makes it a very real issue. --- Yes, you can work around all these issues, but they aren't convenient, which is what this thread is about. But much worse are the increased propensity for silent bugs identified here.
Dec 13 2013
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Dec 13, 2013 at 06:01:10PM -0800, Walter Bright wrote:
 On 12/13/2013 6:52 AM, Dicebot wrote:
[...]
3. unicode
Number one on my list of advantages. Does not apply to plenty of projects though.
It applies more than you might think. Most D apps will be inherently unicode correct. Very, very few C programs are unless the programmer went to some effort to make it so. And, take a look at all those miserable UNICODE macros in windows.h.
Yeah, in C, you have to proactively write code to be unicode-compatible. And I daresay a very large percentage of C code is *not*, and it's a major effort to make it compatible.
4. wchar_t that is actually usable
Same as (3)
Almost nobody uses wchar_t in C code because it is unusable. Windows uses it for the "W" api functions, but surrogate pairs are broken in just about every C program, because C has no idea what a surrogate pair is. Furthermore, you're just fscked if you try to port wchar_t code from Windows to Linux, you're looking at line-by-line rewrite of all of that code.
I tried writing wchar_t code before. I tried making it portable by following only the wchar_t functions described in official C standards. I discovered that the C standards are incomplete w.r.t. wchar_t: there are many unspecified and underspecified areas, such as, to take a major example, the non-commitment to providing some means to ensure a Unicode locale. In every official doc that I can find, it depends on setting locale strings, the interpretation of which is "implementation-dependent" (i.e., you're on your own). There isn't even a way to reliably check whether you're currently in a Unicode locale (y'know, when you give up on trying to set the locale string yourself and (questionably) rely on the user to do it, but your code assumes UTF-8 and you need a way to detect an incompatible locale setting).

And the semantics of many wchar_t functions are vague and underspecified ("depends on locale setting"), and some key functions are missing, or wrapped behind very inconvenient APIs (*ahem*mbtowcs*wcstomb*cough*).

Long story short, let's just say that even writing wchar_t code from scratch is a royal pain in the neck, *and* there's no guarantee the end product will actually work correctly. Unless you reinvent the wheel, disregard wchar_t, and rewrite your own UTF-8 implementation. Don't even speak of converting an existing C program to wchar_t.

I was so scarred from the experience that when I saw that D supported unicode natively, I was totally sold. [...]
6. no global errno being set by the math library functions
This has made me smile :) It shows how different applications we have in mind speaking about "C domain".
Do you mean people don't do math in C apps?
Weird. Most of my personal projects (originally C/C++, now D) are math-related. :) [...]
12. forward referencing (no need to declare everything twice)
Not an issue. C programmers are not tired from typing.
C programs tend to be written "bottom up" to avoid forward references. This is not convenient.
I still do that even in D programs, because DMD's handling of forward references is, shall we say, quirky? It works most of the time, but sometimes you get odd errors because certain symbol resolution algorithms used by dmd will produce unexpected results if you don't declare certain symbols beforehand. So it's not completely order-free, but also not completely order-dependent, but something nebulous in between. Me, I play it safe and just write things the C way, so that I never run into these kinds of issues. [...]
20. no global locale madness
(no idea what this means)
strtod's behavior (for example) is dependent on the current locale. The fact that you didn't know this is indicative of its problems. This is not the only locale-dependent behavior. C has a number of issues with global state.
Yeah, like errno, one of the ugliest hacks to be made an official standard. And the entire wchar_t train-wreck, every bit of which is officially declared "locale-dependent", meaning they change their behaviour depending on the locale string you set, and of course the locale strings themselves are "implementation-dependent", so there's basically zero commitment to make portable code possible at all. Sure, to make your program truly portable you do have to invest some effort into it, but given the amount of ugliness you have to endure to work with wchar_t in the first place, you might as well just reinvent your own UTF implementation from scratch (the API would be cleaner, for one thing).

And that's not even scratching the surface of things like strtod, like Walter mentioned, that almost everyone *assumes* works a certain way, but may have unexpected results once you insert a setlocale() call into your program. Action-at-a-distance FTW.

T

-- 
Do not reason with the unreasonable; you lose by definition.
Dec 14 2013
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
H. S. Teoh:

 I still do that even in D programs, because DMD's handling of 
 forward references is, shall we say, quirky?
In the last three years it has improved :-) Please submit the remaining bugs on this. Bye, bearophile
Dec 14 2013
prev sibling next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 12/14/2013 06:39 PM, H. S. Teoh wrote:
 I still do that even in D programs, because DMD's handling of forward
 references is, shall we say, quirky? It works most of the time, but
 sometimes you get odd errors because certain symbol resolution
 algorithms used by dmd will produce unexpected results if you don't
 declare certain symbols beforehand. So it's not completely order-free,
 but also not completely order-dependent, but something nebulous in
 between. Me, I play it safe and just write things the C way, so that I
 never run into these kinds of issues.
I couldn't resist the temptation and hence I'm still stuck with 2.060. (And the code base contains quite a few hackarounds to make it compile even with 2.060.) It seems really hard to minimize a reproducible test case. Maybe I should just upload the code to github and file a regression report. (But such a bug report does not really address the underlying problem, which is that the spec is ambiguous about this as well. Most bugs related to this get fixed quickly, but the fixes introduce breakage at other points.)
Dec 14 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/14/2013 9:39 AM, H. S. Teoh wrote:
 On Fri, Dec 13, 2013 at 06:01:10PM -0800, Walter Bright wrote:
 On 12/13/2013 6:52 AM, Dicebot wrote:
I was so scarred from the experience that when I saw that D supported unicode natively, I was totally sold.
Funny story about that. Before I started D, I worked on a C++ project that had to work with Unicode. I ran into all the same issues you did, and also decided that wchar_t was unusable with Unicode. I spent a lot of time getting the Unicode stuff to work correctly. I was so scarred from the experience (!) that I decided that proper Unicode support was an absolute must for D.
 12. forward referencing (no need to declare everything twice)
Not an issue. C programmers are not tired from typing.
C programs tend to be written "bottom up" to avoid forward references. This is not convenient.
I still do that even in D programs, because DMD's handling of forward references is, shall we say, quirky? It works most of the time, but sometimes you get odd errors because certain symbol resolution algorithms used by dmd will produce unexpected results if you don't declare certain symbols beforehand. So it's not completely order-free, but also not completely order-dependent, but something nebulous in between. Me, I play it safe and just write things the C way, so that I never run into these kinds of issues.
dmd's forward reference issues come from a bad design choice in the compiler implementation. I've been gradually fixing the design, and things have gotten a lot better. For example, with the latest release you can forward reference enum members, even in the same enum definition!
Dec 14 2013
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 12/14/2013 08:45 PM, Walter Bright wrote:
 dmd's forward reference issues come from a bad design choice in the
 compiler implementation. I've been gradually fixing the design, and
 things have gotten a lot better. For example, with the latest release
 you can forward reference enum members, even in the same enum definition!
Broke it.

    string bb(int x, string y){ return "3"; }
    enum E {
        foo = bb(cast(int)bar, cast(string)baz),
        bar=1,
        baz="2"
    }

Bring it on! :o)

https://d.puremagic.com/issues/show_bug.cgi?id=11746
Dec 14 2013
prev sibling next sibling parent Marco Leise <Marco.Leise gmx.de> writes:
Am Fri, 13 Dec 2013 18:01:10 -0800
schrieb Walter Bright <newshound2 digitalmars.com>:

 On 12/13/2013 6:52 AM, Dicebot wrote:
 1. compile speed
Only partially true. Large projects need separate compilation and D does not behave that good in such scenario. Still better than C, but not good enough to make a difference.
Doesn't behave that good how?
Compiling optimized GtkD static libs for all three compilers and a shared lib for dmd uses 7750s user and 553s system CPU time at 2 GHz. That's an average of 35 minutes per complete build. I think this is mostly due to the separate compilation.

-- 
Marco
Dec 14 2013
prev sibling parent "Dicebot" <public dicebot.lv> writes:
I won't continue nitpicking on separate points because it clearly
seems to me we are speaking with completely different
applications in mind. Have you noticed the project example I
described for Manu
(http://forum.dlang.org/post/mznzsfktnzfggckgyeer@forum.dlang.org)?
It does not work with application-layer strings (thus no
Unicode) and does no floating point at all. But lack of forced
inlining or control over emitted symbols would have just killed
it; there is no way anyone would even consider a language which
does not provide that, no matter how awesome it may be
at a higher level of abstraction.

I am not going to argue that D can beat C in the user-space domain
because pretty much anything can beat C there. It is not even
worth discussing. The interesting part is the world where C is still
completely unrivaled - embedded / barebone. D can _possibly_ beat
C here but is very far from there _right now_ because of missing
basics. Once the basics are here, a lot of the advantages you have
mentioned will really start to matter. Not now.
Dec 17 2013
prev sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Dicebot:

 2. dependable sizes of basic types
Not a real issue as your platform SDK always includes some kind of "stdint.h"
On the other hand in D you have less noise. Most D modules you find around use and will use the built-in types, instead of choosing every time different ones. This uniformity is an improvement.
 Would have called this a feature if one could actually use 
 something instead out of the box. But there is no way to 
 control symbol visibility right now so templates / CTFE are 
 often out of the toolset. And what to do without those?
This should be fixed.
 9. modules
Other than (1) it is also more a problem than help in current implementation
Despite their flaws, D modules are quite handy, and better than C way of managing source code files.
 you need to care also about emitted ModuleInfo.
I think LDC2 has a pragma to disable module info generation. Can't something like that be standardized for all D compilers, including DMD? Have you opened well reasoned enhancement requests for your desires?
 Not an issue. C programmers are not tired from typing.
That's a silly answer. When it works (and it's working more and more), the freedom it gives is handy.
 Nice but just a syntax sugar yet again.
Having less cluttered code is an improvement.
 16. struct alignment as a language feature rather than an ugly 
 extension kludge
Same as (11)
Having built-in standard features save you troubles and work.
 19. no need for global variables when qsorting
Doesn't matter
Why?
 20. no global locale madness
(no idea what this means)
I think Walter refers to this kind of stuff: http://www.cplusplus.com/reference/clocale/
 `auto` has no real value in C world because there are not crazy 
 template types.
"auto" can be handy in C-like code too (but it's less needed because the types are simpler), to avoid repeating struct names, to avoid repeating types two times when you call malloc, etc.
  safe is a joke for barebone, you almost never be able to apply 
 it :)
I think you can have some safe functions in C-style code too :-)
 - Allocation for stuff like array literals making those 
 unusable once you remove runtime away
There is hope to remove this problem:

https://github.com/D-Programming-Language/dmd/pull/2952
https://github.com/D-Programming-Language/dmd/pull/2958
 - No internal linkage or reliable LTO  - no way to take care of 
 unused symbols and control binary size / layout
This could be worked on.

And now instead of negatives, let's talk about missing positive features. A future replacement for the C language should allow the kind of low level code you use in C, and also offer ways to express more semantics to the compiler, that will be verified and enforced, to make the code safer. Currently the only language I know that is a bit like this is the ATS2 language (http://www.ats-lang.org/ ), and it's partially a failure. (D and Rust are more replacements for C++ than for C).

There's still plenty of work to do in the design of low-level languages. A compiler for such a (probably imperative) language will probably need to contain a SAT solver, and more inference skills.

Bye,
bearophile
Dec 14 2013
next sibling parent reply "Paulo Pinto" <pjmp progtools.org> writes:
On Saturday, 14 December 2013 at 17:12:16 UTC, bearophile wrote:
 Dicebot:



  safe is a joke for barebone, you almost never be able to 
 apply it :)
I think you can have some safe functions in C-style code too :-)
Yes, given my experience in Turbo Pascal and Oberon, there are lots of places in C-style code that could be safe as well. For example, there are very few places where dark magic pointer tricks are really essential. Only those code sections really need to be unsafe.

-- 
Paulo
Dec 14 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/14/2013 9:37 AM, Paulo Pinto wrote:
 On Saturday, 14 December 2013 at 17:12:16 UTC, bearophile wrote:
 Dicebot:



  safe is a joke for barebone, you almost never be able to apply it :)
I think you can have some safe functions in C-style code too :-)
Yes, given my experience in Turbo Pascal and Oberon, there are lots of places in C-style code that could be safe as well. For example, there are very few places where dark magic pointer tricks are really essential. Only those code sections really need to be unsafe.
Pretty much all use of pointers in C is unsafe because C cannot statically (or dynamically) verify that the pointers point to valid data, or that arithmetic on those pointers will result in pointers to valid data. This is the huge advantage that D's dynamic arrays have. You can write safe code in C, but you cannot mechanically verify it as safe.
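As a rough sketch of the difference, a D dynamic array carries its length along with the pointer, so accesses can be checked. The same pairing can be written by hand in C, as below, but `int_slice`, `slice_get` and `slice_sub` are hypothetical names and nothing in the language forces callers to go through them, which is the point being made about mechanical verification.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pointer-plus-length pair, loosely modelled on what a
   D dynamic array carries. */
typedef struct {
    int   *ptr;
    size_t len;
} int_slice;

/* Bounds-checked read: returns 1 and stores the element,
   or returns 0 if the index is out of range. */
int slice_get(int_slice s, size_t i, int *out)
{
    assert(out != NULL);
    if (i >= s.len)
        return 0;
    *out = s.ptr[i];
    return 1;
}

/* Sub-slice [lo, hi) without copying; an empty slice on bad bounds. */
int_slice slice_sub(int_slice s, size_t lo, size_t hi)
{
    int_slice r = { NULL, 0 };
    if (lo <= hi && hi <= s.len) {
        r.ptr = s.ptr + lo;
        r.len = hi - lo;
    }
    return r;
}
```

Any code that reaches into `s.ptr` directly bypasses the checks, and the compiler has no way to tell checked from unchecked use.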
Dec 14 2013
parent Paulo Pinto <pjmlp progtools.org> writes:
Am 14.12.2013 20:33, schrieb Walter Bright:
 On 12/14/2013 9:37 AM, Paulo Pinto wrote:
 On Saturday, 14 December 2013 at 17:12:16 UTC, bearophile wrote:
 Dicebot:



  safe is a joke for barebone, you almost never be able to apply it :)
I think you can have some safe functions in C-style code too :-)
Yes, given my experience in Turbo Pascal and Oberon, there are lots of places in C-style code that could be safe as well. For example, there are very few places where dark magic pointer tricks are really essential. Only those code sections really need to be unsafe.
Pretty much all use of pointers in C is unsafe because C cannot statically (or dynamically) verify that the pointers point to valid data, or that arithmetic on those pointers will result in pointers to valid data. This is the huge advantage that D's dynamic arrays have.
Yes, similar to what is known as open arrays in those languages.
 You can write safe code in C, but you cannot mechanically verify it as
 safe.
True, even with the help of static analysers, it all falls apart the moment you have third party code only available in binary format as libraries. No way to validate them in C. This is where safer languages like D have an edge over C as well. -- Paulo
Dec 15 2013
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/14/2013 9:12 AM, bearophile wrote:
 Dicebot:

 2. dependable sizes of basic types
Not a real issue as your platform SDK always includes some kind of "stdint.h"
On the other hand in D you have less noise. Most D modules you find around use and will use the built-in types, instead of choosing every time different ones. This uniformity is an improvement.
It's true that standard C has a zillion names for an integer, and then a typical project will layer on a mass of its own (even dmd does this). The grief comes because pretty much no C project ever uses them consistently and correctly, implicit narrowing conversions hide the bugs, and of course they get messed up with printf formats.
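The printf format problem can be illustrated with a short sketch. `format_u64` is a hypothetical helper; the `PRIu64` macro comes from the standard `<inttypes.h>` header.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit value portably. A hard-coded "%lu" happens to work
   where unsigned long is 64 bits but is wrong where it is 32 bits
   (64-bit Windows, for instance). PRIu64 expands to the right
   conversion specifier for uint64_t on the current platform. */
int format_u64(char *buf, size_t size, uint64_t value)
{
    assert(buf != NULL);
    return snprintf(buf, size, "%" PRIu64, value);
}
```

The compiler is not required to diagnose a mismatched format string, so a project that mixes its own integer typedefs with plain `%d` or `%lu` can carry such bugs until it is ported to a platform with different type sizes.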
Dec 14 2013
prev sibling parent "ed" <sillymongrel gmail.com> writes:
On Thursday, 12 December 2013 at 09:01:17 UTC, Paulo Pinto wrote:
 On Thursday, 12 December 2013 at 02:12:00 UTC, ed wrote:
 On Wednesday, 11 December 2013 at 03:33:47 UTC, Walter Bright 
 wrote:
 [snip]

 The issue is convenience of writing C code in D vs C.
So you're trying to say that it's easier to write C code in D, rather than in C? I thought this thread was about the inherent advantages of D over C.
I was referring specifically to Dicebot's post as ancestor:
[snip]

I am finding C is much easier and more pleasant to write with DMD. At work we're forced, under duress, to write C. I just got a new project with a loose deadline so I thought I'd do a crazy experiment to make it interesting...

(NOTE: I say "under duress" but I secretly like C/C++, especially C++11/14.)

I'm writing my C code with DMD. When tested and tweaked I do a final compile with C compiler (test once more) then commit for our QA to pick up. Occasionally I'll compile with the C compiler to ensure I haven't leaked any D into the code and to minimise the #include fixups at the end.

Currently this is about 20 C-(D) files with approx. 12,000-15,000 LOC. I doubt this workflow would scale much further, although it doesn't look like becoming an issue yet.

My experiment is a success IMO. My C code is much cleaner, safer and more maintainable because of it. Yes, I know I could write C like this without DMD ... but I'm lazy and fall back into bad C habits :-)

I now advocate that students should be taught C programming with the DMD compiler :D

Cheers,
Ed
Currently I always advocate that C and C++ development should always be done with warnings as errors enabled, coupled with static analyzers at very least during CI builds, breaking them if anything is found. Nice story though, thanks for sharing. -- Paulo
I agree 100% and do so with my real D code also. While it was a fun experiment I really don't think this workflow could ever replace good static analysis tools.

Cheers,
Ed
Dec 12 2013
prev sibling parent reply "Daniel Murphy" <yebblies nospamgmail.com> writes:
"ed" <sillymongrel gmail.com> wrote in message 
news:ibnfbsvxqzjxyfpnzseh forum.dlang.org...
 I'm writing my C code with DMD. When tested and tweaked I do a final 
 compile with C compiler (test once more) then commit for our QA to pick 
 up.  Occasionally I'll compile with the C compiler to ensure I haven't 
 leaked any D into the code and to minimise the #include fixups at the end.
I used to do this for all my university assignments in C/C++/java.
Dec 12 2013
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 12/12/2013 11:12 AM, Daniel Murphy wrote:
 "ed" <sillymongrel gmail.com> wrote in message
 news:ibnfbsvxqzjxyfpnzseh forum.dlang.org...
 I'm writing my C code with DMD. When tested and tweaked I do a final
 compile with C compiler (test once more) then commit for our QA to pick
 up.  Occasionally I'll compile with the C compiler to ensure I haven't
 leaked any D into the code and to minimise the #include fixups at the end.
I used to do this for all my university assignments in C/C++/java.
I've actually visited a course where submissions in D were allowed.
Dec 12 2013
prev sibling parent reply Marco Leise <Marco.Leise gmx.de> writes:
Am Tue, 10 Dec 2013 22:16:25 +0100
schrieb "Adam D. Ruppe" <destructionator gmail.com>:

 On Tuesday, 10 December 2013 at 21:05:53 UTC, Walter Bright wrote:
 At the least, it'll compile a lot faster!
Small C programs compile a *lot* faster than small D programs that use Phobos. import std.stdio; == add half a second to your compile time.

    $ time dmd hellod.d
    user    0m0.649s
    sys     0m0.102s

    $ time gcc helloc.c
    user    0m0.095s
    sys     0m0.039s

yikes, even doing printf in D is slow nowadays

    $ time dmd hellod.d
    user    0m0.212s
    sys     0m0.058s

Larger D programs do better, of course, at least if you compile all the files at once (and don't use so much CTFE that it starts thrashing the swap file).
Isn't it fairer to compile only (-c):

    dmd -c std_stdio.d  0,30s user 0,07s system 99% cpu 0,374 total
    dmd -c printf.d     0,00s user 0,00s system 87% cpu 0,008 total
    gcc -c printf.c     0,02s user 0,01s system 93% cpu 0,031 total

-- 
Marco
Dec 10 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/10/2013 3:23 PM, Marco Leise wrote:
 Isn't it fairer to compile only (-c):

 dmd -c std_stdio.d  0,30s user 0,07s system 99% cpu 0,374 total
 dmd -c printf.d     0,00s user 0,00s system 87% cpu 0,008 total
 gcc -c printf.c     0,02s user 0,01s system 93% cpu 0,031 total
Yup, since gcc is not statically linking with the C runtime library, but dmd is statically linking to Phobos.
Dec 10 2013
prev sibling parent "Dicebot" <public dicebot.lv> writes:
On Tuesday, 10 December 2013 at 21:05:53 UTC, Walter Bright wrote:
 On 12/10/2013 12:39 PM, Dicebot wrote:
 I think it is better to rephrase it as "writing C code in D is 
 possible but even
 less convenient than in C".
Why would it be less convenient? At the least, it'll compile a lot faster!
We have been already fighting on that topic several times here and on reddit ;) Probably the most frustrating difference for me are array literals that allocate in D, literally banning its usage from typical C-like code.
Dec 10 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 10/12/13 09:21, H. S. Teoh wrote:
 It turned out that I had overlooked a simple but very significant
 optimization present in the C version that hadn't been implemented
 in the D version yet. [...] In the original C code, it took quite
 a while to implement this optimization because ...  well, in C, you
 had to spell out every last thing, otherwise it just won't work. In
 D, I kicked a crude version of it out in under a day.
There are such amazing multiplicative gains from D's design decisions -- as you describe here, even when you _need_ to drill down and micro-optimize, very often that too can be achieved in a way that is simpler than its C equivalent. And so, at every level of your code, you get that opportunity to focus more on exploration and improvement and less on firefighting.
Dec 10 2013
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Dec 10, 2013 at 10:38:42AM +0100, Joseph Rushton Wakeling wrote:
 On 10/12/13 09:21, H. S. Teoh wrote:
It turned out that I had overlooked a simple but very significant
optimization present in the C version that hadn't been implemented
in the D version yet. [...] In the original C code, it took quite
a while to implement this optimization because ...  well, in C, you
had to spell out every last thing, otherwise it just won't work. In
D, I kicked a crude version of it out in under a day.
There are such amazing multiplicative gains from D's design decisions -- as you describe here, even when you _need_ to drill down and micro-optimize, very often that too can be achieved in a way that is simpler than its C equivalent. And so, at every level of your code, you get that opportunity to focus more on exploration and improvement and less on firefighting.
I like that term "firefighting". :) An apt description of a large proportion of C programming. And I would add, *especially* true in C++ (cf. http://bartoszmilewski.com/2013/09/19/edward-chands/). At a certain point, it just becomes so tiresome to be spending more time fighting the language than actually attacking the problem you're trying to solve.

That's not to say everything is perfect in D -- sometimes working around compiler bugs (or issues in the const system) starts to feel a bit like firefighting too. But D gets so many more things right that I find myself far more productive in D than in C or C++.

Exploration is far easier, as you say. Especially in C, the lack of adequate abstraction mechanisms forces you to commit to a particular design early, and once you commit to that design, it takes a major overhaul if later on you decide that a different design is better. A lot of time is spent fighting the language, rather than actually moving on with the task at hand.

I would say this is another point where D wins over C: sure, if you already know exactly what your final program will look like, you can plan ahead and design it for maximum performance from the get-go in C. But many times you *don't* know beforehand what's the best design, and there's quite a bit of exploration needed before you settle on one. In C, exploration is difficult because it could mean rewriting the whole thing from the ground up. In D, far less effort is needed, which gives you more time to work on stability and tuning performance, rather than throwing in a bunch of last-minute hacks just to make things work because you've spent too much time firefighting, and now the deadline's almost up and the product needs to be shipped.

T

-- 
This is not a sentence.
Dec 10 2013
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Monday, 9 December 2013 at 19:19:46 UTC, Walter Bright wrote:
 Inlining across source files?
Yes. LTO consists of dumping the IR into the object file. The linker can then merge all the IR into one giant module and perform global optimizations. This is expensive, but useful for getting the last drop of performance out of release builds.
Dec 09 2013
prev sibling parent "David Nadlinger" <code klickverbot.at> writes:
On Monday, 9 December 2013 at 19:19:46 UTC, Walter Bright wrote:
 On 12/9/2013 6:24 AM, Araq wrote:
 Both GCC and LLVM can perform link time optimizations.
Inlining across source files?
Yes, and MSVC does so too. David
Dec 09 2013
prev sibling next sibling parent reply "ponce" <contact g3mesfrommars.fr> writes:
I work all day with C++ optimization and deal closely with the 
Intel compiler, here is what I have to say. I agree with all 
points but I think 1, 3 and 7 are slightly inaccurate.

 1. D knows when data is immutable. C has to always make worst 
 case assumptions, and assume indirectly accessed data mutates.
ICC (and other C++ compilers) has plenty of ways to disambiguate aliasing:
- a pragma to let the optimizer assume no loop dependency
- the restrict keyword
- /Qalias-const: assumes a parameter of type pointer-to-const does not alias with a parameter of type pointer-to-non-const
- a GCC-like strict aliasing rule

In most cases I've seen, the "no loop dependency" pragma is downright spectacular and gives the most bang for the buck. Every other method is annoying and barely useful in comparison. It's not clear to me which aliasing rules D assumes.
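For readers who have not used it, here is a minimal sketch of the `restrict` item from the list above, in standard C99. The function name `scale_add` is made up for illustration, and whether the loop actually gets vectorized depends on the compiler and flags.

```c
#include <stddef.h>

/* Hypothetical example: dst[i] += k * src[i].
 * `restrict` is a promise that dst and src never overlap, so the
 * compiler may keep values in registers and vectorize the loop
 * without emitting a runtime overlap check.
 * The promise is not verified: calling this with overlapping
 * arrays is undefined behavior. */
void scale_add(float *restrict dst, const float *restrict src,
               float k, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] += k * src[i];
}
```

The caller carries the whole burden here: nothing stops `scale_add(a, a + 1, 2.0f, n)` from compiling.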
 3. Function inlining has generally been shown to be of 
 tremendous value in optimization. D has access to all the 
 source code in the program, or at least as much as you're 
 willing to show it, and can inline across modules. C cannot 
 inline functions unless they appear in the same module or in .h 
 files. It's a rare practice to push many functions into .h 
 files. Of course, there are now linkers that can do whole 
 program optimization for C, but those are kind of herculean 
 efforts to work around that C limitation of being able to see 
 only one module at a time.
This point is not entirely accurate. While the C model is generally harmful to inlining, with the Intel C++ compiler you can absolutely rely on cross-module inlining when doing global optimization. I don't know how it works, but all our tiny functions hidden in separate translation units get inlined. ICC also provides 4 very useful pragmas for optimization: {forcing|not forcing} inlining [recursively] at the call point, instead of the definition point. I find them better than any inline/__force_inline at the definition point.
 7. D's "final switch" enables more efficient switch code 
 generation, because the default doesn't have to be considered.
A good point. The default: branch can be marked unreachable with most C++ compilers I know of. People don't do it though. In my experience, ICC performs sufficient static analysis to be able to avoid the switch prelude test. I don't like relying on that, though, since an optimization that depends on static analysis is not reliable.

Would be amazing to have the ICC backend work with a D front-end :) It kicked my ass so many times.
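A minimal sketch of marking the default branch unreachable, assuming GCC or Clang. `__builtin_unreachable()` is a compiler extension, not standard C (MSVC spells the same hint `__assume(0)`), and the enum and function names are invented for illustration.

```c
enum color { RED, GREEN, BLUE };

/* Hypothetical example of an exhaustive switch. */
int color_weight(enum color c)
{
    switch (c) {
    case RED:   return 1;
    case GREEN: return 2;
    case BLUE:  return 4;
    default:
        /* Tells the optimizer that no other value can occur, so it
         * may drop the range check in front of the jump table.
         * Passing any other value is undefined behavior. */
        __builtin_unreachable();
    }
}
```

The difference from D's `final switch` is who does the checking: here the programmer asserts exhaustiveness and nothing verifies it, whereas `final switch` over an enum is rejected at compile time if a member is missing.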
Dec 08 2013
next sibling parent "ponce" <contact g3mesfrommars.fr> writes:
And I agree that all these points are not very important anyway 
since a D program will usually be so much faster to make and to 
refactor anyway.
Dec 08 2013
prev sibling next sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 08/12/13 13:35, ponce wrote:
 I work all day with C++ optimization and deal closely with the Intel compiler,
 here is what I have to say. I agree with all points but I think 1, 3 and 7 are
 slightly inaccurate.
How is icc doing these days? I used it years ago (almost 10 years ago!) when it produced significantly faster executables than gcc, but I had the impression that more recent gcc releases either matched its performance or significantly narrowed the gap.
Dec 08 2013
parent reply "ponce" <contact g3mesfrommars.fr> writes:
On Sunday, 8 December 2013 at 13:00:26 UTC, Joseph Rushton 
Wakeling wrote:
 How is icc doing these days?  I used it years ago (almost 10 
 years ago!) when it produced significantly faster executables 
 than gcc, but I had the impression that more recent gcc 
 releases either matched its performance or significantly 
 narrowed the gap.
I don't know. People say the gap has reduced a lot and you have to use the #pragmas to get ahead.
Dec 08 2013
parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Sunday, 8 December 2013 at 13:02:56 UTC, ponce wrote:
 On Sunday, 8 December 2013 at 13:00:26 UTC, Joseph Rushton 
 Wakeling wrote:
 How is icc doing these days?  I used it years ago (almost 10 
 years ago!) when it produced significantly faster executables 
 than gcc, but I had the impression that more recent gcc 
 releases either matched its performance or significantly 
 narrowed the gap.
I don't know. People say the gap has reduced a lot and you have to use the #pragmas to get ahead.
My experience is that if you write loads of loops that *are* vectorisable, but not trivially so, and then run on modern Intel hardware, Intel will beat gcc. Otherwise, probably not. It's become a rather narrow target.
Dec 08 2013
prev sibling next sibling parent reply "qznc" <qznc web.de> writes:
On Sunday, 8 December 2013 at 12:35:45 UTC, ponce wrote:
 1. D knows when data is immutable. C has to always make worst 
 case assumptions, and assume indirectly accessed data mutates.
ICC (and other C++ compilers) has plenty of ways to disambiguate aliasing:
- a pragma to let the optimizer assume no loop dependency
- the restrict keyword
- /Qalias-const: assumes a parameter of type pointer-to-const does not alias with a parameter of type pointer-to-non-const
- a GCC-like strict aliasing rule
To be fair, all of these are unsafe optimizations. You only use them after carefully identifying the hot spot. D immutability is based on a (probably) sound type system and can be used without danger.
Dec 08 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/8/2013 6:26 AM, qznc wrote:
 On Sunday, 8 December 2013 at 12:35:45 UTC, ponce wrote:
 1. D knows when data is immutable. C has to always make worst case
 assumptions, and assume indirectly accessed data mutates.
ICC (and other C++ compilers) has plenty of ways to disambiguate aliasing:
- a pragma to let the optimizer assume no loop dependency
- the restrict keyword
- /Qalias-const: assumes a parameter of type pointer-to-const does not alias with a parameter of type pointer-to-non-const
- a GCC-like strict aliasing rule
To be fair, all of these are unsafe optimizations. You only use them after carefully identifying the hot spot. D immutability is based on a (probably) sound type system and can be used without danger.
To be fairer (!), all of these (except restrict) are non-Standard extensions for C. "restrict" is an extension for C++.
Dec 08 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-12-08 19:44, Walter Bright wrote:

 To be fairer (!), all of these (except restrict) are non-Standard
 extensions for C. "restrict" is an extension for C++.
It doesn't matter they're not standard, as long as people are using them. -- /Jacob Carlborg
Dec 09 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/9/2013 12:11 AM, Jacob Carlborg wrote:
 On 2013-12-08 19:44, Walter Bright wrote:

 To be fairer (!), all of these (except restrict) are non-Standard
 extensions for C. "restrict" is an extension for C++.
It doesn't matter they're not standard, as long as people are using them.
If a language needs extensions in order to be performant, then the language has problems claiming the crown of best performer. Furthermore, you cannot argue that C is more performant when you really mean "Brand X C". That's why I based my remarks on Standard C, not some vendor's extensions.
Dec 09 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-12-09 09:50, Walter Bright wrote:

 If a language needs extensions in order to be performant, then the
 language has problems claiming the crown of best performer.

 Furthermore, you cannot argue that C is more performant when you really
 mean "Brand X C".

 That's why I based my remarks on Standard C, not some vendor's extensions.
Sure, but then this comparison might not be so interesting. I'm trying to say that it might be more interesting to look at what D can do now and what developers are actually using, including vendor extensions. -- /Jacob Carlborg
Dec 09 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/9/2013 8:03 AM, Jacob Carlborg wrote:
 Sure, but then this comparison might not be so interesting. I'm trying to say
 that it might be more interesting to look at what D can do now and what developers are
 actually using, including vendor extensions.
As pointed out, there are several outstanding examples where D outperforms C.
Dec 09 2013
prev sibling parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Monday, 9 December 2013 at 08:11:04 UTC, Jacob Carlborg wrote:
 On 2013-12-08 19:44, Walter Bright wrote:

 To be fairer (!), all of these (except restrict) are 
 non-Standard
 extensions for C. "restrict" is an extension for C++.
It doesn't matter they're not standard, as long as people are using them.
Not when writing portable code. Nowadays I just do JVM/.NET stuff, but I still remember the headaches of writing portable C and C++ code across commercial UNIX systems during 1999 - 2001. -- Paulo
Dec 09 2013
parent reply "Szymon Gatner" <noemail gmail.com> writes:
On Monday, 9 December 2013 at 09:58:01 UTC, Paulo Pinto wrote:
 Not when writing portable code.

 Nowadays I just do JVM/.NET stuff, but I still remember the
 headaches of writing portable C and C++ code across commercial
 UNIX systems during 1999 - 2001.
Don't know much about C, but writing portable C++ is not a pleasant task, especially now, when you really want to use C++11 features. Saying that nobody cares about portable code is of course completely ignorant. I would even say that today it is more relevant than single-platform code.
Dec 09 2013
parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Monday, 9 December 2013 at 10:03:31 UTC, Szymon Gatner wrote:
 On Monday, 9 December 2013 at 09:58:01 UTC, Paulo Pinto wrote:
 Not when writing portable code.

 Nowadays I just do JVM/.NET stuff, but I still remember the
 headaches of writing portable C and C++ code across commercial
 UNIX systems during 1999 - 2001.
Don't know much about C, but writing portable C++ is not a pleasant task, especially now, when you really want to use C++11 features. Saying that nobody cares about portable code is of course completely ignorant. I would even say that today it is more relevant than single-platform code.
Many people in the open source world equate writing portable C code with using gcc and clang across Linux/BSD distributions, while complaining about Windows. However, things look different when using commercial C compilers across many systems: the C language standard is full of undefined behaviors that will bite you once you start using multiple vendors. Then there was the POSIX standard, with its own set of undefined behaviors and vendor-specific extensions. Additionally, when I was on this project, some of the C compilers were still midway between K&R and ANSI C compliance. Thankfully most of the application was written in TCL with OS-specific C bindings.

As for C++, it has always been like that. One of the reasons I liked Java so much in the beginning was that it is a C++-like language, with Pascal-like safety, that I could use everywhere without having #ifdef all over my code, always wondering what language constructs would be supported. Writing portable C++ code before the C++98 standard was finalized was a guessing game of which compiler supported what, and how far it was from the ongoing draft. So if you wanted to play safe, the C++ code was basically a better C with in-house developed classes.

-- 
Paulo
Dec 09 2013
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/8/2013 4:35 AM, ponce wrote:
 3. Function inlining has generally been shown to be of tremendous value in
 optimization. D has access to all the source code in the program, or at least
 as much as you're willing to show it, and can inline across modules. C cannot
 inline functions unless they appear in the same module or in .h files. It's a
 rare practice to push many functions into .h files. Of course, there are now
 linkers that can do whole program optimization for C, but those are kind of
 herculean efforts to work around that C limitation of being able to see only
 one module at a time.
This point is not entirely accurate. While the C model is generally harmful to inlining, with the Intel C++ compiler you can absolutely rely on cross-module inlining when doing global optimization. I don't know how it works, but all our tiny functions hidden in separate translation units get inlined.
I believe this is the linker thing I mentioned at work.
Dec 08 2013
prev sibling parent reply "Sean Kelly" <sean invisibleduck.org> writes:
On Friday, 6 December 2013 at 22:20:19 UTC, Walter Bright wrote:
 "there is no way proper C code can be slower than those 
 languages."
Didn't Bjarne cover this in his C++ performance talk at SD West in 2007? Templates alone can make C++ and D code faster than even hand-optimized C. And that doesn't even consider some of the other points you mentioned.
Dec 10 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Dec 10, 2013 at 11:47:47PM +0100, Sean Kelly wrote:
 On Friday, 6 December 2013 at 22:20:19 UTC, Walter Bright wrote:
"there is no way proper C code can be slower than those
languages."
Didn't Bjarne cover this in his C++ performance talk at SD West in 2007? Templates alone can make C++ and D code faster than even hand-optimized C. And that doesn't even consider some of the other points you mentioned.
The thing is, what constitutes "proper C" is not well-defined, because since C translates to machine code (as does C++ and D), in theory *everything* has access to the same level of performance -- that is, the hardware. So arguably, no matter what code fragment you may present in C++ or D, there's always a corresponding C code fragment that performs equally fast or faster. But that obscures the fact that said C code fragment may be written in an unmanageably convoluted style that no one in their right mind would actually use in practice. (And the same can be said for C++ and D: use asm blocks, and you'll beat any "normal" C code, but that proves nothing since the whole issue is writing *idiomatic* C vs. *idiomatic* D, not writing things in an unnatural way just so you can lay claim to the title of best performance.) T -- No! I'm not in denial!
Dec 10 2013
next sibling parent reply "Sean Kelly" <sean invisibleduck.org> writes:
On Wednesday, 11 December 2013 at 00:19:50 UTC, H. S. Teoh wrote:
 On Tue, Dec 10, 2013 at 11:47:47PM +0100, Sean Kelly wrote:
 On Friday, 6 December 2013 at 22:20:19 UTC, Walter Bright 
 wrote:
"there is no way proper C code can be slower than those
languages."
Didn't Bjarne cover this in his C++ performance talk at SD West in 2007? Templates alone can make C++ and D code faster than even hand-optimized C. And that doesn't even consider some of the other points you mentioned.
The thing is, what constitutes "proper C" is not well-defined, because since C translates to machine code (as does C++ and D), in theory *everything* has access to the same level of performance -- that is, the hardware. So arguably, no matter what code fragment you may present in C++ or D, there's always a corresponding C code fragment that performs equally fast or faster. But that obscures the fact that said C code fragment may be written in an unmanageably convoluted style that no one in their right mind would actually use in practice. (And the same can be said for C++ and D: use asm blocks, and you'll beat any "normal" C code, but that proves nothing since the whole issue is writing *idiomatic* C vs. *idiomatic* D, not writing things in an unnatural way just so you can lay claim to the title of best performance.)
Bjarne's point was essentially that templates allow code to be inlined to a ridiculous degree. C code simply can't compete with that and still be maintainable. I suppose you could expand that to mean any inlineable code rather than just templates, but the same assertion holds. I don't think it makes sense to consider the scenario where the "C code" is purpose built assembly code because that says nothing about the language itself. Nor does it make sense to restrict the compared language to "C style code" (whatever that means) because any speed advantages would be leveraging language features or conventions not present in idiomatic C. For example, I've posted benchmarks of a JSON parser here before that I wrote in C and it's considerably faster than anything I've seen done in D. Does that mean that C is faster than D? Absolutely not. I could recompile the same code in D and have it be just as fast as the C version. The more interesting issue is how easy that problem was to solve in each language, and which is more maintainable.
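A rough C illustration of the inlining point, with hypothetical function names. `qsort` receives its comparison through a function pointer, which a compiler often cannot inline when the sort lives in a separately compiled library; a hand-specialized sort has the comparison written directly in the loop. The insertion sort is only there to show where the comparison lives, not as a fair speed comparison against `qsort`.

```c
#include <stdlib.h>
#include <stddef.h>

/* Comparator for qsort: reached through a function pointer
 * for every single comparison. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Generic route: one sort routine for all types, indirect compare. */
void sort_generic(int *v, size_t n)
{
    qsort(v, n, sizeof v[0], cmp_int);
}

/* Specialized route: the comparison is a plain `>` that the compiler
 * sees directly. A template instantiation produces this per type
 * automatically; in C it has to be written or macro-generated. */
void sort_specialized(int *v, size_t n)
{
    for (size_t i = 1; i < n; ++i) {
        int key = v[i];
        size_t j = i;
        while (j > 0 && v[j - 1] > key) {
            v[j] = v[j - 1];
            --j;
        }
        v[j] = key;
    }
}
```

Keeping a specialized version like this for every element type is exactly the maintenance cost being described.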
Dec 10 2013
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Dec 11, 2013 at 01:33:24AM +0100, Sean Kelly wrote:
 On Wednesday, 11 December 2013 at 00:19:50 UTC, H. S. Teoh wrote:
On Tue, Dec 10, 2013 at 11:47:47PM +0100, Sean Kelly wrote:
On Friday, 6 December 2013 at 22:20:19 UTC, Walter Bright wrote:
"there is no way proper C code can be slower than those languages."
Didn't Bjarne cover this in his C++ performance talk at SD West in 2007? Templates alone can make C++ and D code faster than even hand-optimized C. And that doesn't even consider some of the other points you mentioned.
The thing is, what constitutes "proper C" is not well-defined, because since C translates to machine code (as does C++ and D), in theory *everything* has access to the same level of performance -- that is, the hardware. So arguably, no matter what code fragment you may present in C++ or D, there's always a corresponding C code fragment that performs equally fast or faster. But that obscures the fact that said C code fragment may be written in an unmanageably convoluted style that no one in their right mind would actually use in practice. (And the same can be said for C++ and D: use asm blocks, and you'll beat any "normal" C code, but that proves nothing since the whole issue is writing *idiomatic* C vs. *idiomatic* D, not writing things in an unnatural way just so you can lay claim to the title of best performance.)
Bjarne's point was essentially that templates allow code to be inlined to a ridiculous degree. C code simply can't compete with that and still be maintainable. I suppose you could expand that to mean any inlineable code rather than just templates, but the same assertion holds. I don't think it makes sense to consider the scenario where the "C code" is purpose built assembly code because that says nothing about the language itself. Nor does it make sense to restrict the compared language to "C style code" (whatever that means) because any speed advantages would be leveraging language features or conventions not present in idiomatic C. For example, I've posted benchmarks of a JSON parser here before that I wrote in C and it's considerably faster than anything I've seen done in D. Does that mean that C is faster than D? Absolutely not. I could recompile the same code in D and have it be just as fast as the C version. The more interesting issue is how easy that problem was to solve in each language, and which is more maintainable.
Right, that was what I was getting at. And I think on that count, D trumps C because idiomatic D is both of comparable performance *and* more maintainable, whereas highly-optimized C is unreadable. Not to mention D is easier to write (and write *correctly* -- and again I'm reminded of Milewski's article about how the naïve way to write C++, which I think also applies to C, is more often than not the wrong way, and the right way is very convoluted and unnatural). T -- Elegant or ugly code as well as fine or rude sentences have something in common: they don't depend on the language. -- Luca De Vitis
Dec 10 2013
prev sibling parent reply "Ola Fosheim Grøstad" writes:
On Wednesday, 11 December 2013 at 00:19:50 UTC, H. S. Teoh wrote:
 hardware. So arguably, no matter what code fragment you may  
 present in
 C++ or D, there's always a corresponding C code fragment that 
 performs
 equally fast or faster.
Yes, but the unix/C-way is to have many simple programs that work together to make complex systems. C is meant to be used with makefiles and source-generating tools such as Lex, Ragel and even the C-preprocessor. C++ and D claim to be self-sufficient. C was never meant to be, it was meant to be part of the unix eco-system. What you do with templates and compile-time-expressions in C++/D is what you do with the ecosystem of tools in C. Therefore a comparison between C and C++/D should include that C-ecosystem. If people don't like sourcecode-generating tools, fine, but that is the unix/C way of programming and it should be included when assessing the power of C versus C++/D (and their template libraries).
 But that obscures the fact that said C  code
 fragment may be written in an unmanageably convoluted style 
 that no one in their right mind would actually use in practice.
Well, C-programmers do, if they have tools that generate that convoluted style from a readable input file (like lex).
 but that proves nothing since the whole issue is writing  
 *idiomatic* C
 vs. *idiomatic* D, not writing things in an unnatural way just 
 so you can lay claim to the title of best performance.)
Exactly, and idiomatic C is to use source-generating tools. Just about all medium to large size C projects use such tools that go beyond the C-preprocessor (which conceptually is a separate tool that is optional in theory).

Anyway, one cannot discuss performance without discussing the target. Much of the stuff in C makes sense on memory-constrained hardware, even C-strings are great when you want to conserve memory and have hardware-support for 0-guarded strings (string-instructions that will stop on 0). And, JITed regexps won't work on mobile platforms or platforms that require signed code.

We are now getting a new range of memory constrained hardware, transputer-like processors with many simple cores with fast local memory and a saturated link to main memory. So the memory-efficient way of getting performance is still highly relevant.

Performance is always contextual. E.g. I think OpenCL is just an intermediate step to getting performance, compilers will soon have to emit co-processor friendly code automagically and languages will have to provide constructs that make that happen in the most efficient way. So if C is out-dated then so are all other languages in current use… ;-)

O.
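As a small illustration of source generation with nothing but the C preprocessor, here is a sketch of the X-macro idiom. The token list and all names are invented for the example; external generators like Lex work on the same principle at a larger scale.

```c
#include <string.h>

/* Single source of truth: a hypothetical token list. */
#define TOKEN_LIST(X) \
    X(TOK_IF,    "if")    \
    X(TOK_ELSE,  "else")  \
    X(TOK_WHILE, "while")

/* Expand the list into an enum... */
#define AS_ENUM(id, text) id,
enum token { TOKEN_LIST(AS_ENUM) TOK_COUNT };

/* ...and into a matching string table, kept in sync automatically. */
#define AS_TEXT(id, text) text,
static const char *const token_text[] = { TOKEN_LIST(AS_TEXT) };

/* Linear lookup; returns TOK_COUNT when the word is not a keyword. */
enum token lookup_token(const char *word)
{
    for (int i = 0; i < TOK_COUNT; ++i)
        if (strcmp(word, token_text[i]) == 0)
            return (enum token)i;
    return TOK_COUNT;
}
```

Adding a token means touching one line of `TOKEN_LIST`; the enum and the table are regenerated on every compile.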
Dec 16 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/16/13 5:35 AM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com> wrote:
 On Wednesday, 11 December 2013 at 00:19:50 UTC, H. S. Teoh wrote:
 hardware. So arguably, no matter what code fragment you may present in
 C++ or D, there's always a corresponding C code fragment that performs
 equally fast or faster.
Yes, but the unix/C-way is to have many simple programs that work together to make complex systems. C is meant to be used with makefiles and source-generating tools such as Lex, Ragel and even the C-preprocessor. C++ and D claim to be self-sufficient. C was never meant to be, it was meant to be part of the unix eco-system. What you do with templates and compile-time-expressions in C++/D is what you do with the ecosystem of tools in C. Therefore a comparison between C and C++/D should include that C-ecosystem. If people don't like sourcecode-generating tools, fine, but that is the unix/C way of programming and it should be included when assessing the power of C versus C++/D (and their template libraries).
 But that obscures the fact that said C  code
 fragment may be written in an unmanageably convoluted style that no
 one in their right mind would actually use in practice.
Well, C-programmers do, if they have tools that generate that convoluted style from a readable input file (like lex).
 but that proves nothing since the whole issue is writing *idiomatic* C
 vs. *idiomatic* D, not writing things in an unnatural way just so you
 can lay claim to the title of best performance.)
Exactly, and idiomatic C is to use source-generating tools. Just about all medium to large size C projects use such tools that go beyond the C-preprocessor (which conceptually is a separate tool that is optional in theory).
Nonsense. Using extralinguistic tools including code generators is not the exclusive appurtenance of C. Any large project uses some for various purposes. Needless to say, extralinguistic generators often compare poorly with language-integrated solutions. Look where the preprocessor has taken C - it's compromised the entire notion of preprocessing. And m4, more powerful and supposedly better, has only spawned more madness.
 Anyway, one cannot discuss performance without discussing the target.
 Much of the stuff in C makes sense on memory-constrained hardware, even
 C-strings are great when you want to conserve memory and have
 hardware-support for 0-guarded strings (string-instructions that will
 stop on 0).  And, JITed regexps won't work on mobile platforms or
 platforms that require signed code.

 We are now getting a new range of memory constrained hardware,
 transputer-like processors with many simple cores with fast local
 memory and a saturated link to main memory. So the memory-efficient way
 of getting performance is still highly relevant.

 Performance is always contextual. E.g. I think OpenCL is just an
 intermediate step to getting performance, compilers will soon have to
 emit co-processor friendly code automagically and languages will have to
 provide constructs that makes that happen in the most efficient way. So
 if C is out-dated then so are all other languages in current use… ;-)
Current applications also demand good modeling power. The days when one thousand lines was a nontrivial program are behind us. The right solution is a language that combines performance control with the modeling required by large applications. Andrei
Dec 16 2013
parent reply "Ola Fosheim Grøstad" writes:
On Monday, 16 December 2013 at 18:20:23 UTC, Andrei Alexandrescu 
wrote:
 Nonsense. Using extralinguistic tools including code generators 
 is not the exclusive appurtenance of C.
Not sure what you mean is nonsense. In general, having to resort to macros and source-generating tools has been seen as a weakness of the semantics of the language. In most dynamic languages you never have to do that (because you can eval()). However, the Unix/C philosophy has always been that of having a conglomerate of smaller programs to build larger systems. It is not a "quick fix", it is in line with the basic philosophy of having many simple tools. Hence it is idiomatic. When you reach the complexity of C++/D you really should not be required to resort to such techniques. C is a simple language that used to have simple compilers.
 And m4, more powerful and supposedly better, has only spawned 
 more madness.
m4 is quite powerful, but macro-processors are annoying, if that is what you are implying. I've only used m4 to address limitations in more limited languages.
 Current applications also demand good modeling power. The days 
 when one thousand lines was a nontrivial program are behind us. 
 The right solution is a language that combines performance 
 control with the modeling required by large applications.
M… There are two schools of object-orientation: that of object oriented design/modelling and that of object oriented programming. If you don't skip the modelling part you can write OO in most languages. Many programmers skip the modelling part and think that doing OOP is sufficient. It isn't. OOP is about ADTs. OOD is about understanding the domain, where you need flexibility in the future etc; the language is not the most important aspect of getting good structure. You can probably do a good job even in PHP (which is a very crappy language) if your analysis of the domain is good.

If you want more modelling power you should look at Beta, Datalog/Prolog etc… D doesn't provide more modelling power than other imperative OO languages. Does it? Are the objects nested like in Beta? Do you have the ability to do virtual inheritance on classes? Do you have the ability to extend virtual functions by calling the sub-class from the superclass (inner-statements rather than outer, so that you can enforce transactional BEGIN/END clauses)?

But that was not my point. My point was that performance on the CPU is going to be less important because it accounts for only 10% of the total performance. GPUs now do TerraFLOPs. FPGAs are being packaged with CPUs by Zynq, OpenCL will be able to compile to FPGAs. So basically, to get performance the language should support more than just the CPU and that is not easy because of the potential bottlenecks between the subsystems.
Dec 16 2013
parent "Ola Fosheim Grøstad" writes:
On Monday, 16 December 2013 at 19:08:49 UTC, Ola Fosheim Grøstad 
wrote:
 10% of the total performance. GPUs now do TerraFLOPs. FPGAs are
Tera… ;)
Dec 16 2013