
digitalmars.D - The God Language

reply Walter Bright <newshound2 digitalmars.com> writes:
http://pastebin.com/AtuzJqh0
Dec 29 2011
next sibling parent reply Max Samukha <maxsamukha gmail.com> writes:
On 12/29/2011 11:16 AM, Walter Bright wrote:
 http://pastebin.com/AtuzJqh0
He will soon realize that he wants an earthborn language rather than the one of God :)
Dec 29 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/29/2011 1:32 AM, Max Samukha wrote:
 On 12/29/2011 11:16 AM, Walter Bright wrote:
 http://pastebin.com/AtuzJqh0
He will soon realize that he wants an earthborn language rather than the one of God :)
Watch out, or you may attract a thunderbolt!!
Dec 29 2011
prev sibling next sibling parent reply Caligo <iteronvexor gmail.com> writes:
On Thu, Dec 29, 2011 at 3:16 AM, Walter Bright
<newshound2 digitalmars.com>wrote:

 http://pastebin.com/AtuzJqh0
This is somewhat of a serious question: If there is a God (I'm not saying there isn't, and I'm not saying there is), what language would he choose to create the universe? It would be hard for us mortals to imagine, but would it resemble a functional programming language more or something else? And what type of hardware would the code run on? I mean, there are computations happening all around us, e.g., when an apple falls or planets circle the sun, etc, so what's performing all the computation?
Dec 29 2011
next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 29 December 2011 at 10:16:03 UTC, Caligo wrote:
 On Thu, Dec 29, 2011 at 3:16 AM, Walter Bright
 <newshound2 digitalmars.com>wrote:

 http://pastebin.com/AtuzJqh0
This is somewhat of a serious question: If there is a God (I'm not saying there isn't, and I'm not saying there is), what language would he choose to create the universe? It would be hard for us mortals to imagine, but would it resemble a functional programming language more or something else? And what type of hardware would the code run on? I mean, there are computations happening all around us, e.g., when an apple falls or planets circle the sun, etc, so what's performing all the computation?
Obligatory XKCD: http://xkcd.com/224/
Dec 29 2011
prev sibling next sibling parent reply Gour <gour atmarama.net> writes:
On Thu, 29 Dec 2011 04:15:27 -0600
Caligo <iteronvexor gmail.com> wrote:

 This is somewhat of a serious question:  If there is a God (I'm not
 saying there isn't, and I'm not saying there is),
There is. ;)
 It would be hard for us mortals to imagine, but would it resemble a
 functional programming language more or something else?
Just answer the following question: Are we mortals the result of pure function or just side-effect?

Sincerely,
Gour

--
There are principles to regulate attachment and aversion pertaining to
the senses and their objects. One should not come under the control of
such attachment and aversion, because they are stumbling blocks on the
path of self-realization.

http://atmarama.net | Hlapicina (Croatia) | GPG: 52B5C810
Dec 29 2011
next sibling parent Caligo <iteronvexor gmail.com> writes:
On Thu, Dec 29, 2011 at 4:40 AM, Gour <gour atmarama.net> wrote:

 Just answer the following question: Are we mortals the result of pure
 function or just side-effect?
You are asking about creationism and evolution, aren't you? I have to say that I don't know. Always trust the one who is looking for the truth, not the one who has found it. :-)
Dec 29 2011
prev sibling parent reply maarten van damme <maartenvd1994 gmail.com> writes:
I think it would be an object oriented language; I'm a believer in
string theory :)
I have actually thought of the whole universe as one big simulation, which
would really explain how light waves without a medium (like a math function).

If I were god I would def use object oriented, because it makes for easy
describing of different particles and strings. And I'm pretty sure there is
no garbage collector included in god's language :p
Dec 29 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"maarten van damme" <maartenvd1994 gmail.com> wrote in message 
news:mailman.1985.1325157846.24802.digitalmars-d puremagic.com...
I think it would be an object oriented language, I'm a believer in the
 string theory :)
I heard on the Science Channel that M-theory was becoming favored over string theory. (Not that I would actually know.)
 I have actually thought of the whole universe as one big simulation, would
 really explain how light waves without medium (like a math function).
I came across a book one time that talked about the 'verse basically being one big quantum computer. I didn't actually read through it though, and I can't remember what it was called... :(
 If I were god I would def use object oriented because it makes for easy
 describing of different particles and strings. and I'm pretty sure there 
 is
 no garbage collector included in gods language :p
If I were god, then I'd presumably be omnipotent, and if I were omnipotent, then I'd be able to do it all in something like FuckFuck, or that Shakespearean language, or that lolcat language, without any difficulty. And I could just fix any limitations in the implementation. So that would seem the best option :)
Jan 02 2012
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/02/2012 09:00 PM, Nick Sabalausky wrote:
 "maarten van damme"<maartenvd1994 gmail.com>  wrote in message
 news:mailman.1985.1325157846.24802.digitalmars-d puremagic.com...
 I think it would be an object oriented language, I'm a believer in the
 string theory :)
I heard on the Science Channel that M-theory was becoming favored over string therory. (Not that I would actually know.)
 I have actually thought of the whole universe as one big simulation, would
 really explain how light waves without medium (like a math function).
I came across a book one time that talked about the 'verse basically being one big quantum computer. I didn't actually red through it though, and I can't remember what it was called... :(
 If I were god I would def use object oriented because it makes for easy
 describing of different particles and strings. and I'm pretty sure there
 is
 no garbage collector included in gods language :p
If I were god, then I'd presumably be omnipotent, and if I were omnipotent, then I'd be able to do it all in something like FuckFuck, or that shakesperian language, or that lolcat language without any difficulty. And I could just fix any limitations in the implementation. So that would seem the best option :)
God cannot be omnipotent. If he was, he could invent a task he cannot solve.
Jan 02 2012
next sibling parent Caligo <iteronvexor gmail.com> writes:
On Mon, Jan 2, 2012 at 4:29 PM, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 01/02/2012 09:00 PM, Nick Sabalausky wrote:

 "maarten van damme"<maartenvd1994 gmail.com**>  wrote in message
 news:mailman.1985.1325157846.**24802.digitalmars-d puremagic.**com...

 I think it would be an object oriented language, I'm a believer in the
 string theory :)
I heard on the Science Channel that M-theory was becoming favored over string therory. (Not that I would actually know.) I have actually thought of the whole universe as one big simulation,
 would
 really explain how light waves without medium (like a math function).
I came across a book one time that talked about the 'verse basically being one big quantum computer. I didn't actually red through it though, and I can't remember what it was called... :( If I were god I would def use object oriented because it makes for easy
 describing of different particles and strings. and I'm pretty sure there
 is
 no garbage collector included in gods language :p
If I were god, then I'd presumably be omnipotent, and if I were omnipotent, then I'd be able to do it all in something like FuckFuck, or that shakesperian language, or that lolcat language without any difficulty. And I could just fix any limitations in the implementation. So that would seem the best option :)
God cannot be omnipotent. If he was, he could invent a task he cannot solve.
He has; the human race.
Jan 02 2012
prev sibling parent reply Gour <gour atmarama.net> writes:
On Mon, 02 Jan 2012 23:29:17 +0100
Timon Gehr <timon.gehr gmx.ch> wrote:

 God cannot be omnipotent. If he was, he could invent a task he cannot
 solve.
Wrong. He is not static, but dynamic, so He can invent a task he cannot
solve, but in the next moment he can solve it. ;)

Sincerely,
Gour

--
When your intelligence has passed out of the dense forest
of delusion, you shall become indifferent to all that has
been heard and all that is to be heard.

http://atmarama.net | Hlapicina (Croatia) | GPG: 52B5C810
Jan 02 2012
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/03/2012 08:26 AM, Gour wrote:
 On Mon, 02 Jan 2012 23:29:17 +0100
 Timon Gehr<timon.gehr gmx.ch>  wrote:

 God cannot be omnipotent. If he was, he could invent a task he cannot
 solve.
Wrong. He is not static, but dynamic, so He can invent a task he cannot solve, but in the next moment he can solve it. ;) Sincerely, Gour
I meant he can invent a task he will never be able to solve. ;)
Jan 02 2012
next sibling parent Gour <gour atmarama.net> writes:
On Tue, 03 Jan 2012 08:31:33 +0100
Timon Gehr <timon.gehr gmx.ch> wrote:

 I meant he can invent a task he will never be able to solve. ;)
Nah... those are just side-effects, iow. noise. :-D

Sincerely,
Gour

--
But those who, out of envy, disregard these teachings and do not
follow them are to be considered bereft of all knowledge, befooled,
and ruined in their endeavors for perfection.

http://atmarama.net | Hlapicina (Croatia) | GPG: 52B5C810
Jan 02 2012
prev sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Timon Gehr" <timon.gehr gmx.ch> wrote in message 
news:jduasl$ndh$1 digitalmars.com...
 On 01/03/2012 08:26 AM, Gour wrote:
 On Mon, 02 Jan 2012 23:29:17 +0100
 Timon Gehr<timon.gehr gmx.ch>  wrote:

 God cannot be omnipotent. If he was, he could invent a task he cannot
 solve.
Wrong. He is not static, but dynamic, so He can invent a task he cannot solve, but in the next moment he can solve it. ;) Sincerely, Gour
I meant he can invent a task he will never be able to solve. ;)
I've never felt that argument to be particularly compelling: I see it as merely indicating that an omnipotent being is able to give up their own omnipotence. Which, being omnipotent, they'd of course have to be capable of doing.

Of course, you could then try "Could he create a task he couldn't solve without giving up his own omnipotence?" But I think that amounts to a logical contradiction akin to any other, such as "Could an omnipotent being make a rock that isn't a rock?" And that's a whole other philosophical matter (i.e., do logical contradictions count as something an omnipotent being must be able to do?).
Jan 03 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/3/2012 12:48 AM, Nick Sabalausky wrote:
 "Could an omnipotent being make a
 rock that isn't a rock?"
I don't know, but I'm sure he could make a product that is both a floor wax and a dessert topping.
Jan 03 2012
parent reply "Nick Sabalausky" <a a.a> writes:
"Walter Bright" <newshound2 digitalmars.com> wrote in message 
news:jdvgnr$2uer$1 digitalmars.com...
 On 1/3/2012 12:48 AM, Nick Sabalausky wrote:
 "Could an omnipotent being make a
 rock that isn't a rock?"
I don't know, but I'm sure he could make a product that is both a floor wax and a dessert topping.
I'm having visions of Billy Mays...
Jan 03 2012
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/3/2012 10:25 AM, Nick Sabalausky wrote:
 "Walter Bright"<newshound2 digitalmars.com>  wrote in message
 news:jdvgnr$2uer$1 digitalmars.com...
 On 1/3/2012 12:48 AM, Nick Sabalausky wrote:
 "Could an omnipotent being make a
 rock that isn't a rock?"
I don't know, but I'm sure he could make a product that is both a floor wax and a dessert topping.
I'm having visions of Billy Mays...
Wrong reference! Google "floor wax and dessert topping".
Jan 03 2012
prev sibling next sibling parent J Arrizza <cppgent0 gmail.com> writes:
  and I'm pretty sure there is  no garbage collector included in gods
 language :p
Are you sure? There is good evidence he strongly prefers GCs. Consider almost all insects; consider dung beetles specifically. Consider supernovas, gravity and accretion disks. Consider Disney and the Circle of Life. It's pretty clear he views automated recycling as a general architectural approach.

A large benefit of a GC is that it disassociates responsibility for cleanup from the creator of the object. Now imagine the opposite: after you died, you were responsible for disassembling yourself for use by others to create themselves (think "Soylent Green, The Next Generation"). And if you didn't do it, or you didn't do it properly, the world would eventually overcrowd and explode, leaving a core dump in space. Nice.

Of course, he'd give himself a switch to turn off the GC when he really needed to.

John
Jan 02 2012
prev sibling next sibling parent maarten van damme <maartenvd1994 gmail.com> writes:
2012/1/3 J Arrizza <cppgent0 gmail.com>

 Are you sure? There is good evidence he strongly prefers gc's. Consider
 almost all insects; consider dung beetles specifically. Consider super
 novas, gravity and accretion disks. Consider Disney and the Circle of Life.
 It's pretty clear he views automated recycling as a general architectural
 approach.
A large benefit of a gc is it disassociates responsibility for cleanup from the creator of the object. Now imagine the opposite: after you died, you were responsible for disassembling yourself for use by others to create themselves (think "Soylent Green, The Next Generation"). And if you didn't do it, or you didn't do it properly, the world would eventually overcrowd and explode, leaving a core dump in space. Nice.
 Of course, he'd give himself a switch to turn off the gc when he really
 needed to.
there is no destruction/creation going on, energy is constant at all times in a closed system. That's how I thought about it :) If it's constant anyway he wouldn't have to bother with a gc, would he?
I meant he can invent a task he will never be able to solve. ;)
This seems rather strange, doesn't it? If something is able to do everything, he should be able to invent something he is not able to do. If he invented something he is not able to do, he can't do everything. One could therefore assume it is not possible to be able to do everything :D
Well, if you want to discuss string theory...

http://xkcd.com/171/
http://xkcd.com/397/

:)
great one, I really like the first one. It's really the essence of string theory in a way :)
Jan 03 2012
prev sibling parent J Arrizza <cppgent0 gmail.com> writes:
On Tue, Jan 3, 2012 at 2:36 AM, maarten van damme
<maartenvd1994 gmail.com>wrote:

 there is no destruction/creation going on, energy is constant at all times
 in a closed system. That's how I thought about it :)
 If it's constant anyway he wouldn't have to bother with a gc, would he?
I see. Something like "Matter is neither created nor destroyed...". But similarly, memory is neither created nor destroyed. Unless of course you're talking about a god language that can create hardware at run-time:

// make sure the power supply can handle the extra memory
this.PowerSupply.currentCurrent()++;

// ... don't forget extra bypass capacitance
// and check the wiring just in case.
Capacitor mycap = new Capacitor(0.47uF);
this.PowerSupply.BypassCap.Add(mycap);
assert(this.PowerSupply.PositiveRail.capacity > 2.1A);
assert(this.PowerSupply.NegativeRail.capacity > 2.1A);

// finally! Add the extra storage we need
this.SDRAM.extend(1GB);
I meant he can invent a task he will never be able to solve. ;)
 this seems rather strange doesn't it?
 If something is able to do everything, he should be able to invent
 something he is not able to do. if he invented something he is not able to
 do, he can't do everything.
 One could therefore assume it is not possible to be able to do everything
 :D
Can an omnipotent being bypass logical syllogisms? Don't forget: *ALL* powerful means not just the physical stuff. If so, then your argument doesn't hold... or it does. More precisely, it holds and doesn't hold at the same time, until you open the box and Schrödinger's cat jumps out. Or doesn't.

John
Jan 03 2012
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2011-12-29 11:15, Caligo wrote:
 On Thu, Dec 29, 2011 at 3:16 AM, Walter Bright
 <newshound2 digitalmars.com <mailto:newshound2 digitalmars.com>> wrote:

     http://pastebin.com/AtuzJqh0


 This is somewhat of a serious question:  If there is a God (I'm not
 saying there isn't, and I'm not saying there is), what language would he
 choose to create the universe?  It would be hard for us mortals to
 imagine, but would it resemble a functional programming language more or
 something else?  And what type of hardware would the code run on?  I
 mean, there are computations happening all around us, e.g., when an
 apple falls or planets circle the sun, etc, so what's performing all the
 computation?
Servers in the cloud of course :) -- /Jacob Carlborg
Dec 29 2011
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/29/11 4:15 AM, Caligo wrote:
 On Thu, Dec 29, 2011 at 3:16 AM, Walter Bright
 <newshound2 digitalmars.com <mailto:newshound2 digitalmars.com>> wrote:

     http://pastebin.com/AtuzJqh0


 This is somewhat of a serious question:  If there is a God (I'm not
 saying there isn't, and I'm not saying there is), what language would he
 choose to create the universe?  It would be hard for us mortals to
 imagine, but would it resemble a functional programming language more or
 something else?  And what type of hardware would the code run on?  I
 mean, there are computations happening all around us, e.g., when an
 apple falls or planets circle the sun, etc, so what's performing all the
 computation?
Obligatory: http://xkcd.com/224/ Andrei
Dec 29 2011
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/29/2011 2:15 AM, Caligo wrote:
 If there is a God (I'm not saying there
 isn't, and I'm not saying there is), what language would he choose to create
the
 universe?
Mathematics.
Dec 29 2011
next sibling parent so <so so.so> writes:
On Thu, 29 Dec 2011 20:27:43 +0200, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 12/29/2011 2:15 AM, Caligo wrote:
 If there is a God (I'm not saying there
 isn't, and I'm not saying there is), what language would he choose to  
 create the
 universe?
Mathematics.
In essence and in spirit math is THE answer, but if you mean the implementation we have now, it is verbose nonsense. Yet since we are talking about GOD's language, you have a point! Only an immortal could comprehend math fully.
Dec 29 2011
prev sibling parent reply FeepingCreature <default_357-line yahoo.de> writes:
On 12/29/11 19:27, Walter Bright wrote:
 On 12/29/2011 2:15 AM, Caligo wrote:
 If there is a God (I'm not saying there
 isn't, and I'm not saying there is), what language would he choose to create
the
 universe?
Mathematics.
Fan of Tegmark¹, eh? :)

--
¹ http://en.wikipedia.org/wiki/Mathematical_universe_hypothesis
Dec 29 2011
parent Simen Kjærås <simen.kjaras gmail.com> writes:
On Thu, 29 Dec 2011 21:08:29 +0100, FeepingCreature

<default_357-line yahoo.de> wrote:

 On 12/29/11 19:27, Walter Bright wrote:
 On 12/29/2011 2:15 AM, Caligo wrote:
 If there is a God (I'm not saying there
 isn't, and I'm not saying there is), what language would he choose to
 create the
 universe?
Mathematics.
 Fan of Tegmark¹, eh? :) -- ¹ http://en.wikipedia.org/wiki/Mathematical_universe_hypothesis
I love that one. My favorite is that it indicates the existence of a boolean universe. I like to believe it is currently 'off'.
Jan 02 2012
prev sibling next sibling parent Don <nospam nospam.com> writes:
On 29.12.2011 11:15, Caligo wrote:
 On Thu, Dec 29, 2011 at 3:16 AM, Walter Bright
 <newshound2 digitalmars.com <mailto:newshound2 digitalmars.com>> wrote:

     http://pastebin.com/AtuzJqh0


 This is somewhat of a serious question:  If there is a God (I'm not
 saying there isn't, and I'm not saying there is), what language would he
 choose to create the universe?  It would be hard for us mortals to
 imagine, but would it resemble a functional programming language more or
 something else?  And what type of hardware would the code run on?  I
 mean, there are computations happening all around us, e.g., when an
 apple falls or planets circle the sun, etc, so what's performing all the
 computation?
Declarative. Program begins with void. Let there be <thing>.
Dec 29 2011
prev sibling parent bcs <bcs example.com> writes:
On 12/29/2011 02:15 AM, Caligo wrote:
 This is somewhat of a serious question:  If there is a God (I'm not
 saying there isn't, and I'm not saying there is), what language would he
 choose to create the universe?  It would be hard for us mortals to
 imagine, but would it resemble a functional programming language more or
 something else?  And what type of hardware would the code run on?  I
 mean, there are computations happening all around us, e.g., when an
 apple falls or planets circle the sun, etc, so what's performing all the
 computation?
I have two contradictory answers: Languages: Prolog. Hardware: something that can solve the halting problem (but just for Turing machines).
Jan 04 2012
prev sibling next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 29 December 2011 at 09:16:23 UTC, Walter Bright 
wrote:
 Are you a ridiculous hacker? Inline x86 assembly that the 
 compiler actually understands in 32 AND 64 bit code, hex string 
 literals like x"DE ADB EEF" where spacing doesn't matter, the 
 ability to set data alignment cross-platform with type.alignof 
 = 16, load your shellcode verbatim into a string like so: auto 
 str = import("shellcode.txt");
I would like to talk about this for a bit. Personally, I think D's system programming abilities are only half-way there. Note that I am not talking about use cases in high-level application code, but rather low-level, widely-used framework code, where every bit of performance matters (for example: memory copy routines, string builders, garbage collectors).

In-line assembler as part of the language is certainly neat, and in fact coming from Delphi to C++ I was surprised to learn that C++ implementations adopted different syntax for asm blocks. However, compared to some C++ compilers, it has severe limitations and is D's only trick in this alley.

For one thing, there is no way to force the compiler to inline a function (like __forceinline / __attribute__((always_inline))). This is fine for high-level code (where users are best left with PGO and "the compiler knows best"), but sucks if you need a guarantee that the function must be inlined. The guarantee isn't just about inlining heuristics, but also implementation capabilities. For example, some implementations might not be able to inline functions that use certain language features, and your code's performance could demand that such a short function must be inlined. One example of this is inlining functions containing asm blocks - IIRC DMD does not support this. The compiler should fail the build if it can't inline a function tagged with forceinline, instead of shrugging it off and failing silently, forcing users to check the disassembly every time.

You may have noticed that GCC has some ridiculously complicated assembler facilities. However, they also open the way to the possibilities of writing optimal code - for example, creating custom calling conventions, or inlining assembler functions without restricting the caller's register allocation with a predetermined calling convention. In contrast, DMD is very conservative when it comes to mixing D and assembler. One time I found that putting an asm block in a function turned what were single instructions into blocks of 6 instructions each.

D's lacking in this area makes it impossible to create language features that are on the level of D's compiler built-ins. For example, I have tested three memcpy implementations recently, but none of them could beat DMD's standard array slice copy (despite that in release mode it compiles to a simple memcpy call). Why? Because the overhead of using a custom memcpy routine negated its performance gains. This might have been alleviated with the presence of sane macros, but no such luck. String mixins are not the answer: trying to translate macro-heavy C code to D using string mixins is string escape hell, and we're back to the level of shell scripts.

We've discussed this topic on IRC recently. From what I understood, Andrei thinks improvements in this area are not "impactful" enough, which I find worrisome. Personally, I don't think D qualifies as a true "system programming language" in light of the above. It's more of a compiled language with pointers and assembler.

Before you disagree with any of the above, first (for starters) I'd like to invite you to translate Daniel Vik's C memcpy implementation to D: http://www.danielvik.com/2010/02/fast-memcpy-in-c.html . It doesn't even use inline assembler or compiler intrinsics.
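To make the array-copy comparison concrete, a rough benchmark sketch follows; the buffer size, iteration count, and use of std.datetime.StopWatch (whose home module differs in later compiler releases) are illustrative assumptions, not code or numbers from this post:

import core.stdc.string : memcpy;
import std.datetime : StopWatch;
import std.stdio : writefln;

void main()
{
    auto src = new ubyte[64 * 1024];
    auto dst = new ubyte[64 * 1024];
    enum iterations = 100_000;

    StopWatch sw;

    // Built-in array slice copy, as lowered by the compiler/runtime.
    sw.start();
    foreach (i; 0 .. iterations)
        dst[] = src[];
    sw.stop();
    writefln("slice copy: %s ms", sw.peek().msecs);

    // Explicit call to C's memcpy, for comparison.
    sw.reset();
    sw.start();
    foreach (i; 0 .. iterations)
        memcpy(dst.ptr, src.ptr, src.length);
    sw.stop();
    writefln("memcpy:     %s ms", sw.peek().msecs);
}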
Dec 29 2011
next sibling parent reply Alex Rønne Petersen <xtzgzorex gmail.com> writes:
On 29-12-2011 12:19, Vladimir Panteleev wrote:
 On Thursday, 29 December 2011 at 09:16:23 UTC, Walter Bright wrote:
 Are you a ridiculous hacker? Inline x86 assembly that the compiler
 actually understands in 32 AND 64 bit code, hex string literals like
 x"DE ADB EEF" where spacing doesn't matter, the ability to set data
 alignment cross-platform with type.alignof = 16, load your shellcode
 verbatim into a string like so: auto str = import("shellcode.txt");
I would like to talk about this for a bit. Personally, I think D's system programming abilities are only half-way there. Note that I am not talking about use cases in high-level application code, but rather low-level, widely-used framework code, where every bit of performance matters (for example: memory copy routines, string builders, garbage collectors). In-line assembler as part of the language is certainly neat, and in fact coming from Delphi to C++ I was surprised to learn that C++ implementations adopted different syntax for asm blocks. However, compared to some C++ compilers, it has severe limitations and is D's only trick in this alley. For one thing, there is no way to force the compiler to inline a function (like __forceinline / __attribute((always_inline)) ). This is fine for high-level code (where users are best left with PGO and "the compiler knows best"), but sucks if you need a guarantee that the function must be inlined. The guarantee isn't just about inlining heuristics, but also implementation capabilities. For example, some implementations might not be able to inline functions that use certain language features, and your code's performance could demand that such a short function must be inlined. One example of this is inlining functions containing asm blocks - IIRC DMD does not support this. The compiler should fail the build if it can't inline a function tagged with forceinline, instead of shrugging it off and failing silently, forcing users to check the disassembly every time. You may have noticed that GCC has some ridiculously complicated assembler facilities. However, they also open the way to the possibilities of writing optimal code - for example, creating custom calling conventions, or inlining assembler functions without restricting the caller's register allocation with a predetermined calling convention. In contrast, DMD is very conservative when it comes to mixing D and assembler. One time I found that putting an asm block in a function turned what were single instructions into blocks of 6 instructions each. D's lacking in this area makes it impossible to create language features that are on the level of D's compiler built-ins. For example, I have tested three memcpy implementations recently, but none of them could beat DMD's standard array slice copy (despite that in release mode it compiles to a simple memcpy call). Why? Because the overhead of using a custom memcpy routine negated its performance gains. This might have been alleviated with the presence of sane macros, but no such luck. String mixins are not the answer: trying to translate macro-heavy C code to D using string mixins is string escape hell, and we're back to the level of shell scripts. We've discussed this topic on IRC recently. From what I understood, Andrei thinks improvements in this area are not "impactful" enough, which I find worrisome. Personally, I don't think D qualifies as a true "system programming language" in light of the above. It's more of a compiled language with pointers and assembler. Before you disagree with any of the above, first (for starters) I'd like to invite you to translate Daniel Vik's C memcpy implementation to D: http://www.danielvik.com/2010/02/fast-memcpy-in-c.html . It doesn't even use inline assembler or compiler intrinsics.
+1. D needs a way to force inlining. The compiler can, at best, do heuristics. If D wants to cater to systems programmers -- that is, programmers who *know their shit* -- it needs advanced features like this. Same reason we have __gshared, for example. - Alex
Dec 29 2011
parent reply so <so so.so> writes:
On Thu, 29 Dec 2011 13:44:12 +0200, Alex Rønne Petersen

<xtzgzorex gmail.com> wrote:

 +1. D needs a way to force inlining. The compiler can, at best, do
 heuristics. If D wants to cater to systems programmers -- that is,
 programmers who *know their shit* -- it needs advanced features like
 this. Same reason we have __gshared, for example.

 - Alex
The legitimate "D performs so bad in my example" posts appeared in this = = forum almost always ended up with the conclusion that D's lack a controlled = inline mechanism.
Dec 29 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/29/2011 9:15 AM, so wrote:
 The legitimate "D performs so bad in my example" posts appeared in this forum
 almost always ended up with the conclusion that D's lack a controlled inline
 mechanism.
Standard C doesn't have one either. C vendors often implement vendor-specific extensions for this.
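For what it's worth, D later grew exactly this kind of vendor-neutral hint; the sketch below assumes a compiler recent enough to support pragma(inline, true), which none of the compilers in this thread had, and the function names are made up for illustration:

// With pragma(inline, true), a compiler that honors the pragma is
// expected to report an error instead of silently emitting a call
// when it cannot inline the function (exact behavior varies by
// compiler and version).
pragma(inline, true)
uint rotl8(uint x)
{
    return (x << 8) | (x >> 24);
}

uint scramble(uint x)
{
    // The call below should be expanded in place.
    return rotl8(x) ^ 0x9E3779B9;
}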
Dec 29 2011
prev sibling next sibling parent reply Kapps <Kapps NotValidEmail.com> writes:
Agreed.

There are plenty of real-world, even 'common' examples where the lack of 
being able to force inlining for a function is a problem. The main one 
I've run into is not being able to inline functions with assembly, thus 
not being able to implement efficient SIMD operations.
Dec 29 2011
parent reply a <a a.com> writes:
Kapps Wrote:

 Agreed.
 
 There are plenty of real-world, even 'common' examples where the lack of 
 being able to force inlining for a function is a problem. The main one 
 I've run into is not being able to inline functions with assembly, thus 
 not being able to implement efficient SIMD operations.
The problem is not just inlining but also needless loads and stores at the beginnings and ends of asm blocks. For example, the following code:

void test(ref V a, ref V b)
{
    asm
    {
        movaps XMM0, a;
        addps  XMM0, b;
        movaps a, XMM0;
    }
    asm
    {
        movaps XMM0, a;
        addps  XMM0, b;
        movaps a, XMM0;
    }
}

compiles to:

   0:   55                      push   %rbp
   1:   48 8b ec                mov    %rsp,%rbp
   4:   48 83 ec 10             sub    $0x10,%rsp
   8:   48 89 7d f0             mov    %rdi,-0x10(%rbp)
   c:   48 89 75 f8             mov    %rsi,-0x8(%rbp)
  10:   0f 28 45 f8             movaps -0x8(%rbp),%xmm0
  14:   0f 58 45 f0             addps  -0x10(%rbp),%xmm0
  18:   0f 29 45 f8             movaps %xmm0,-0x8(%rbp)
  1c:   0f 28 45 f8             movaps -0x8(%rbp),%xmm0
  20:   0f 58 45 f0             addps  -0x10(%rbp),%xmm0
  24:   0f 29 45 f8             movaps %xmm0,-0x8(%rbp)
  28:   48 8b e5                mov    %rbp,%rsp
  2b:   5d                      pop    %rbp
  2c:   c3                      retq

The needless loads and stores would make it impossible to write an efficient SIMD add function even if the functions containing asm blocks could be inlined.
Dec 29 2011
next sibling parent reply David Nadlinger <see klickverbot.at> writes:
On 12/29/11 2:13 PM, a wrote:
 void test(ref V a, ref V b)
 {
      asm
      {
          movaps XMM0, a;
          addps  XMM0, b;
          movaps a, XMM0;
      }
      asm
      {
          movaps XMM0, a;
          addps  XMM0, b;
          movaps a, XMM0;
      }
 }

 […]

 The needles loads and stores would make it impossible to write an efficient
simd add function even if the functions containing asm blocks could be inlined.
Yes, this is indeed a problem, and as far as I'm aware, usually solved in the gamedev world by using the (SSE) intrinsics your favorite C++ compiler provides, instead of resorting to inline asm. David
Dec 29 2011
next sibling parent Paulo Pinto <pjmlp progtools.org> writes:
Especially because some 64-bit compilers provide intrinsics as the only
way to access the processor.

Visual C++, for example, does not provide inline assembly support.

David Nadlinger Wrote:

 On 12/29/11 2:13 PM, a wrote:
 void test(ref V a, ref V b)
 {
      asm
      {
          movaps XMM0, a;
          addps  XMM0, b;
          movaps a, XMM0;
      }
      asm
      {
          movaps XMM0, a;
          addps  XMM0, b;
          movaps a, XMM0;
      }
 }

 […]

 The needles loads and stores would make it impossible to write an efficient
simd add function even if the functions containing asm blocks could be inlined.
Yes, this is indeed a problem, and as far as I'm aware, usually solved in the gamedev world by using the (SSE) intrinsics your favorite C++ compiler provides, instead of resorting to inline asm. David
Dec 29 2011
prev sibling parent a <a a.com> writes:
David Nadlinger Wrote:

 On 12/29/11 2:13 PM, a wrote:
 void test(ref V a, ref V b)
 {
      asm
      {
          movaps XMM0, a;
          addps  XMM0, b;
          movaps a, XMM0;
      }
      asm
      {
          movaps XMM0, a;
          addps  XMM0, b;
          movaps a, XMM0;
      }
 }

 […]

 The needles loads and stores would make it impossible to write an efficient
simd add function even if the functions containing asm blocks could be inlined.
Yes, this is indeed a problem, and as far as I'm aware, usually solved in the gamedev world by using the (SSE) intrinsics your favorite C++ compiler provides, instead of resorting to inline asm. David
IIRC Walter doesn't want to add vector intrinsics, so it would be nice if the functions to do vector operations could be efficiently written using inline assembly. It would also be a more general solution than having intrinsics. Something like that is possible with gcc extended inline assembly. For example this:

typedef float v4sf __attribute__((vector_size(16)));

void vadd(v4sf *a, v4sf *b)
{
    asm(
        "addps %1, %0"
        : "=x" (*a)
        : "x" (*b), "0" (*a)
        : );
}

void test(float * __restrict__ a, float * __restrict__ b)
{
    v4sf * va = (v4sf*) a;
    v4sf * vb = (v4sf*) b;
    vadd(va,vb);
    vadd(va,vb);
    vadd(va,vb);
    vadd(va,vb);
}

compiles to:

00000000004004c0 <test>:
  4004c0:       0f 28 0e                movaps (%rsi),%xmm1
  4004c3:       0f 28 07                movaps (%rdi),%xmm0
  4004c6:       0f 58 c1                addps  %xmm1,%xmm0
  4004c9:       0f 58 c1                addps  %xmm1,%xmm0
  4004cc:       0f 58 c1                addps  %xmm1,%xmm0
  4004cf:       0f 58 c1                addps  %xmm1,%xmm0
  4004d2:       0f 29 07                movaps %xmm0,(%rdi)

This should also be possible with GDC, but I couldn't figure out how to get something like __restrict__ (if you want to use vector types and gcc extended inline assembly with GDC, see http://www.digitalmars.com/d/archives/D/gnu/Support_for_gcc_vector_attributes_SIM_builtins_3778.html and https://bitbucket.org/goshawk/gdc/wiki/UserDocumentation).
Dec 29 2011
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/29/2011 5:13 AM, a wrote:
 The needles loads and stores would make it impossible to write an efficient
 simd add function even if the functions containing asm blocks could be
 inlined.
This does what you're asking for:

void test(ref float a, ref float b)
{
    asm
    {
        naked;
        movaps XMM0,[RSI];
        addps  XMM0,[RDI];
        movaps [RSI],XMM0;
        movaps XMM0,[RSI];
        addps  XMM0,[RDI];
        movaps [RSI],XMM0;
        ret;
    }
}
Dec 29 2011
parent reply a <a a.com> writes:
Walter Bright Wrote:

 On 12/29/2011 5:13 AM, a wrote:
 The needles loads and stores would make it impossible to write an efficient
 simd add function even if the functions containing asm blocks could be
 inlined.
This does what you're asking for: void test(ref float a, ref float b) { asm { naked; movaps XMM0,[RSI]; addps XMM0,[RDI]; movaps [RSI],XMM0; movaps XMM0,[RSI]; addps XMM0,[RDI]; movaps [RSI],XMM0; ret; } }
What I want is to be able to write short functions using inline assembly and have them inlined and compiled even to a single instruction where possible. This can be done with gcc. See my post here: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=153879
Dec 29 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/29/2011 2:52 PM, a wrote:
 What I want is to be able to write short functions using inline assembly and
 have them inlined and compiled even to a single instruction where possible.
 This can be done with gcc. See my post here:
 http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=153879
I understand. I just wished to make sure you knew about 'naked' and what good it was for.
Dec 29 2011
prev sibling next sibling parent Peter Alexander <peter.alexander.au gmail.com> writes:
On 29/12/11 11:19 AM, Vladimir Panteleev wrote:
 On Thursday, 29 December 2011 at 09:16:23 UTC, Walter Bright wrote:
 Are you a ridiculous hacker? Inline x86 assembly that the compiler
 actually understands in 32 AND 64 bit code, hex string literals like
 x"DE ADB EEF" where spacing doesn't matter, the ability to set data
 alignment cross-platform with type.alignof = 16, load your shellcode
 verbatim into a string like so: auto str = import("shellcode.txt");
I would like to talk about this for a bit. Personally, I think D's system programming abilities are only half-way there. Note that I am not talking about use cases in high-level application code, but rather low-level, widely-used framework code, where every bit of performance matters (for example: memory copy routines, string builders, garbage collectors). In-line assembler as part of the language is certainly neat, and in fact coming from Delphi to C++ I was surprised to learn that C++ implementations adopted different syntax for asm blocks. However, compared to some C++ compilers, it has severe limitations and is D's only trick in this alley. For one thing, there is no way to force the compiler to inline a function (like __forceinline / __attribute((always_inline)) ). This is fine for high-level code (where users are best left with PGO and "the compiler knows best"), but sucks if you need a guarantee that the function must be inlined. The guarantee isn't just about inlining heuristics, but also implementation capabilities. For example, some implementations might not be able to inline functions that use certain language features, and your code's performance could demand that such a short function must be inlined. One example of this is inlining functions containing asm blocks - IIRC DMD does not support this. The compiler should fail the build if it can't inline a function tagged with forceinline, instead of shrugging it off and failing silently, forcing users to check the disassembly every time. You may have noticed that GCC has some ridiculously complicated assembler facilities. However, they also open the way to the possibilities of writing optimal code - for example, creating custom calling conventions, or inlining assembler functions without restricting the caller's register allocation with a predetermined calling convention. In contrast, DMD is very conservative when it comes to mixing D and assembler. One time I found that putting an asm block in a function turned what were single instructions into blocks of 6 instructions each. D's lacking in this area makes it impossible to create language features that are on the level of D's compiler built-ins. For example, I have tested three memcpy implementations recently, but none of them could beat DMD's standard array slice copy (despite that in release mode it compiles to a simple memcpy call). Why? Because the overhead of using a custom memcpy routine negated its performance gains. This might have been alleviated with the presence of sane macros, but no such luck. String mixins are not the answer: trying to translate macro-heavy C code to D using string mixins is string escape hell, and we're back to the level of shell scripts. We've discussed this topic on IRC recently. From what I understood, Andrei thinks improvements in this area are not "impactful" enough, which I find worrisome. Personally, I don't think D qualifies as a true "system programming language" in light of the above. It's more of a compiled language with pointers and assembler. Before you disagree with any of the above, first (for starters) I'd like to invite you to translate Daniel Vik's C memcpy implementation to D: http://www.danielvik.com/2010/02/fast-memcpy-in-c.html . It doesn't even use inline assembler or compiler intrinsics.
+1

Also: vector intrinsics.

Also: alignment specifications (not just member variables).

The lack of both these things is currently causing me much pain :-( Manually aligning things gets tiresome after a while.
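As an illustration of the manual-alignment chore being described, a minimal sketch follows; alloc16 is a made-up helper, and for brevity it loses the raw pointer that a real version would keep around for freeing:

import core.stdc.stdlib : malloc;

// Over-allocate, then round the pointer up to the next 16-byte boundary.
float* alloc16(size_t n)
{
    auto raw = cast(size_t) malloc(n * float.sizeof + 15);
    auto aligned = (raw + 15) & ~cast(size_t) 15;
    return cast(float*) aligned;
}

void main()
{
    float* p = alloc16(1024);
    assert((cast(size_t) p & 15) == 0); // the alignment an SSE movaps expects
}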
Dec 29 2011
prev sibling next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Vladimir Panteleev:

 One example of this is inlining functions containing 
 asm blocks - IIRC DMD does not support this. The compiler should 
 fail the build if it can't inline a function tagged with 
  forceinline, instead of shrugging it off and failing silently, 
 forcing users to check the disassembly every time.
Right.
 You may have noticed that GCC has some ridiculously complicated 
 assembler facilities. However, they also open the way to the 
 possibilities of writing optimal code - for example, creating 
 custom calling conventions, or inlining assembler functions 
 without restricting the caller's register allocation with a 
 predetermined calling convention. In contrast, DMD is very 
 conservative when it comes to mixing D and assembler. One time I 
 found that putting an asm block in a function turned what were 
 single instructions into blocks of 6 instructions each.
LDC has a means to inline functions with asm, and asm expressions. DMD should have both too. I have been saying this for two or three years.

Bye,
bearophile
Dec 29 2011
prev sibling next sibling parent reply Don <nospam nospam.com> writes:
On 29.12.2011 12:19, Vladimir Panteleev wrote:
 On Thursday, 29 December 2011 at 09:16:23 UTC, Walter Bright wrote:
 Are you a ridiculous hacker? Inline x86 assembly that the compiler
 actually understands in 32 AND 64 bit code, hex string literals like
 x"DE ADB EEF" where spacing doesn't matter, the ability to set data
 alignment cross-platform with type.alignof = 16, load your shellcode
 verbatim into a string like so: auto str = import("shellcode.txt");
I would like to talk about this for a bit. Personally, I think D's system programming abilities are only half-way there. Note that I am not talking about use cases in high-level application code, but rather low-level, widely-used framework code, where every bit of performance matters (for example: memory copy routines, string builders, garbage collectors). In-line assembler as part of the language is certainly neat, and in fact coming from Delphi to C++ I was surprised to learn that C++ implementations adopted different syntax for asm blocks. However, compared to some C++ compilers, it has severe limitations and is D's only trick in this alley. For one thing, there is no way to force the compiler to inline a function (like __forceinline / __attribute((always_inline)) ).
[snip]
 Personally, I don't think D qualifies as a true "system programming
 language" in light of the above. It's more of a compiled language with
 pointers and assembler.
I don't think the situation is any different with DMC. I think that if D isn't a systems programming language, neither is C or C++ without vendor-specific extensions. But it doesn't really matter -- the main conclusion is still correct: D is missing some features which could improve performance considerably.
 Before you disagree with any of the above, first
 (for starters) I'd like to invite you to translate Daniel Vik's C memcpy
 implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html . It doesn't even
 use inline assembler or compiler intrinsics.
Note that the memcpy described there is _far_ from optimal. Memcpy is all about cache efficiency. DMD translates memcpy to the single instruction "rep movsd" which you'd think would be optimal, but you can actually beat it by a factor of four or more for long lengths.
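For flavor only, here is the first step beyond a byte-at-a-time loop, copying machine words; the gains Don is describing come from alignment handling, prefetching, and non-temporal stores, none of which this toy sketch attempts, and wordCopy is a made-up name:

// Toy copy routine: moves size_t-sized words, then the leftover bytes.
void wordCopy(void* dst, const(void)* src, size_t n)
{
    auto d = cast(size_t*) dst;
    auto s = cast(const(size_t)*) src;
    const words = n / size_t.sizeof;

    foreach (i; 0 .. words)
        d[i] = s[i];

    auto db = cast(ubyte*) (d + words);
    auto sb = cast(const(ubyte)*) (s + words);
    foreach (i; 0 .. n % size_t.sizeof)
        db[i] = sb[i];
}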
Dec 29 2011
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 29 December 2011 at 14:44:45 UTC, Don wrote:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html . It 
 doesn't even
 use inline assembler or compiler intrinsics.
Note that the memcpy described there is _far_ from optimal. Memcpy is all about cache effciency. DMD translates memcpy to the single instruction "rep movsd" which you'd think would be optimal, but you can actually beat it by a factor of four or more for long lengths.
I've never seen DMD emit rep movsd. Does rep movsd even make sense when the memory areas do not have the same alignment? memcpy in snn.lib has a rep movsd instruction, but there's lots of other code (including what looks like Duff's device).
Dec 29 2011
parent Don <nospam nospam.com> writes:
On 29.12.2011 16:07, Vladimir Panteleev wrote:
 On Thursday, 29 December 2011 at 14:44:45 UTC, Don wrote:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html . It doesn't even
 use inline assembler or compiler intrinsics.
Note that the memcpy described there is _far_ from optimal. Memcpy is all about cache effciency. DMD translates memcpy to the single instruction "rep movsd" which you'd think would be optimal, but you can actually beat it by a factor of four or more for long lengths.
I've never seen DMD emit rep movsd. Does rep movsd even make sense when the memory areas do not have the same alignment? memcpy in snn.lib has a rep movsd instruction, but there's lots of other code (including what looks like Duff's device).
It's in the backend in cod2.c, line 3260. But on closer inspection -- you're right! It's in an if(0 && ...) block. So it never does it, even when everything's aligned. There's a _huge_ potential for improvement in that function.
Dec 29 2011
prev sibling next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 29 December 2011 at 14:44:45 UTC, Don wrote:
 I don't think the situation is any different with DMC. I think 
 that if D isn't a systems programming lanugage, neither is C or 
 C++ without vendor-specific extensions.
You're right... I've never extensively used a C/C++ compiler without similar extensions, though. The fact that major vendors come up with their own extensions to implement many of the same features suggests that they might have been better off standardized.
Dec 29 2011
next sibling parent so <so so.so> writes:
On Thu, 29 Dec 2011 17:20:22 +0200, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Thursday, 29 December 2011 at 14:44:45 UTC, Don wrote:
 I don't think the situation is any different with DMC. I think that if  
 D isn't a systems programming lanugage, neither is C or C++ without  
 vendor-specific extensions.
You're right... I've never extensively used a C/C++ compiler without similar extensions, though. The fact that major vendors come up with their own extensions to do many of the same features shows that they might have better been standardized.
Well, I remember at most one or two people supported me when I brought it up, and Walter dismissed it instantly.
Dec 29 2011
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Vladimir Panteleev:

 The fact that major vendors 
 come up with their own extensions to do many of the same features 
 shows that they might have better been standardized.
Right. (This is why I once asked for computed gotos to be in the D standard as an explicitly-not-implemented feature, even though DMD doesn't implement them; LDC/GDC could probably implement them quickly.) On the other hand, D2 already makes standard several of the non-standard features of GNU C.

Bye,
bearophile
Dec 29 2011
prev sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 29 December 2011 at 14:44:45 UTC, Don wrote:
 I don't think the situation is any different with DMC. I think 
 that if D isn't a systems programming lanugage, neither is C or 
 C++ without vendor-specific extensions.
C macros are a crude form of inlining. String mixins do not scale well in the same way as C macros (e.g. in the way they're used in said memcpy implementation).
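To make the comparison concrete, here is a rough sketch of the string-mixin counterpart of a small C macro (loosely modeled on the CP_INCR/INC_VAL macros from the memcpy article); the names are made up for illustration:

// A CTFE-able function that builds a statement as a string, playing
// the role a one-line C macro would play.
string cpIncr(string dst, string src)
{
    return "*" ~ dst ~ "++ = *" ~ src ~ "++;";
}

void copy4(ubyte* d, const(ubyte)* s)
{
    // Every use goes through mixin(), and anything non-trivial quickly
    // turns into quoting and escaping D code inside string literals.
    mixin(cpIncr("d", "s"));
    mixin(cpIncr("d", "s"));
    mixin(cpIncr("d", "s"));
    mixin(cpIncr("d", "s"));
}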
Dec 29 2011
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/29/2011 3:19 AM, Vladimir Panteleev wrote:
 I'd like to invite you to translate Daniel Vik's C memcpy implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html
Challenge accepted.

------------------------

/********************************************************************
** File: memcpy.c
**
** Copyright (C) 1999-2010 Daniel Vik
**
** This software is provided 'as-is', without any express or implied
** warranty. In no event will the authors be held liable for any
** damages arising from the use of this software.
** Permission is granted to anyone to use this software for any
** purpose, including commercial applications, and to alter it and
** redistribute it freely, subject to the following restrictions:
**
** 1. The origin of this software must not be misrepresented; you
**    must not claim that you wrote the original software. If you
**    use this software in a product, an acknowledgment in the
**    product documentation would be appreciated but is not
**    required.
**
** 2. Altered source versions must be plainly marked as such, and
**    must not be misrepresented as being the original software.
**
** 3. This notice may not be removed or altered from any source
**    distribution.
**
**
** Description: Implementation of the standard library function memcpy.
**              This implementation of memcpy() is ANSI-C89 compatible.
**
** The following configuration options can be set:
**
**   LITTLE_ENDIAN - Uses processor with little endian
**                   addressing. Default is big endian.
**
**   PRE_INC_PTRS  - Use pre increment of pointers.
**                   Default is post increment of
**                   pointers.
**
**   INDEXED_COPY  - Copying data using array indexing.
**                   Using this option, disables the
**                   PRE_INC_PTRS option.
**
**   MEMCPY_64BIT  - Compiles memcpy for 64 bit
**                   architectures
**
**
** Best Settings:
**
**   Intel x86: LITTLE_ENDIAN and INDEXED_COPY
**
*******************************************************************/

module memcpy;

/********************************************************************
** Configuration definitions.
*******************************************************************/

version = LITTLE_ENDIAN;
version = INDEXED_COPY;

/********************************************************************
** Includes for size_t definition
*******************************************************************/

/********************************************************************
** Typedefs
*******************************************************************/

alias ubyte  UInt8;
alias ushort UInt16;
alias uint   UInt32;
alias ulong  UInt64;

version (D_LP64)
{
    alias UInt64 UIntN;
    enum TYPE_WIDTH = 8;
}
else
{
    alias UInt32 UIntN;
    enum TYPE_WIDTH = 4;
}

/********************************************************************
** Remove definitions when INDEXED_COPY is defined.
*******************************************************************/

//#if defined (INDEXED_COPY)
//#if defined (PRE_INC_PTRS)
//#undef PRE_INC_PTRS
//#endif /*PRE_INC_PTRS*/
//#endif /*INDEXED_COPY*/

/********************************************************************
** Definitions for pre and post increment of pointers.
*******************************************************************/

version (PRE_INC_PTRS)
{
    void START_VAL(ref UInt8* x) { x--; }
    ref T INC_VAL(T)(ref T* x) { return *++x; }
    UInt8* CAST_TO_U8(void* p, int o) { return cast(UInt8*)p + o + TYPE_WIDTH; }
    enum WHILE_DEST_BREAK = (TYPE_WIDTH - 1);
    enum PRE_LOOP_ADJUST = -(TYPE_WIDTH - 1);
    enum PRE_SWITCH_ADJUST = 1;
}
else
{
    void START_VAL(UInt8* x) { }
    ref T INC_VAL(T)(ref T* x) { return *x++; }
    UInt8* CAST_TO_U8(void* p, int o) { return cast(UInt8*)p + o; }
    enum WHILE_DEST_BREAK = 0;
    enum PRE_LOOP_ADJUST = 0;
    enum PRE_SWITCH_ADJUST = 0;
}

/********************************************************************
**
** void *memcpy(void *dest, const void *src, size_t count)
**
** Args:    dest   - pointer to destination buffer
**          src    - pointer to source buffer
**          count  - number of bytes to copy
**
** Return:  A pointer to destination buffer
**
** Purpose: Copies count bytes from src to dest.
**          No overlap check is performed.
**
*******************************************************************/

void *memcpy(void *dest, const void *src, size_t count)
{
    auto dst8 = cast(UInt8*)dest;
    auto src8 = cast(UInt8*)src;
    UIntN* dstN;
    UIntN* srcN;
    UIntN dstWord;
    UIntN srcWord;

    /****************************************************************
    ** Macros for copying words of different alignment.
    ** Uses incremening pointers.
    ***************************************************************/

    void CP_INCR()
    {
        INC_VAL(dstN) = INC_VAL(srcN);
    }

    void CP_INCR_SH(int shl, int shr)
    {
        version (LITTLE_ENDIAN)
        {
            dstWord = srcWord >> shl;
            srcWord = INC_VAL(srcN);
            dstWord |= srcWord << shr;
            INC_VAL(dstN) = dstWord;
        }
        else
        {
            dstWord = srcWord << shl;
            srcWord = INC_VAL(srcN);
            dstWord |= srcWord >> shr;
            INC_VAL(dstN) = dstWord;
        }
    }

    /****************************************************************
    ** Macros for copying words of different alignment.
    ** Uses array indexes.
    ***************************************************************/

    void CP_INDEX(size_t idx)
    {
        dstN[idx] = srcN[idx];
    }

    void CP_INDEX_SH(size_t x, int shl, int shr)
    {
        version (LITTLE_ENDIAN)
        {
            dstWord = srcWord >> shl;
            srcWord = srcN[x];
            dstWord |= srcWord << shr;
            dstN[x] = dstWord;
        }
        else
        {
            dstWord = srcWord << shl;
            srcWord = srcN[x];
            dstWord |= srcWord >> shr;
            dstN[x] = dstWord;
        }
    }

    /****************************************************************
    ** Macros for copying words of different alignment.
    ** Uses incremening pointers or array indexes depending on
    ** configuration.
    ***************************************************************/

    version (INDEXED_COPY)
    {
        void CP(size_t idx) { CP_INDEX(idx); }
        void CP_SH(size_t idx, int shl, int shr) { CP_INDEX_SH(idx, shl, shr); }
        void INC_INDEX(T)(ref T* p, size_t o) { p += o; }
    }
    else
    {
        void CP(size_t idx) { CP_INCR(); }
        void CP_SH(size_t idx, int shl, int shr) { CP_INCR_SH(shl, shr); }
        void INC_INDEX(T)(T* p, size_t o) { }
    }

    void COPY_REMAINING(size_t count)
    {
        START_VAL(dst8);
        START_VAL(src8);

        switch (count)
        {
        case 7: INC_VAL(dst8) = INC_VAL(src8);
        case 6: INC_VAL(dst8) = INC_VAL(src8);
        case 5: INC_VAL(dst8) = INC_VAL(src8);
        case 4: INC_VAL(dst8) = INC_VAL(src8);
        case 3: INC_VAL(dst8) = INC_VAL(src8);
        case 2: INC_VAL(dst8) = INC_VAL(src8);
        case 1: INC_VAL(dst8) = INC_VAL(src8);
        case 0:
        default: break;
        }
    }

    void COPY_NO_SHIFT()
    {
        dstN = cast(UIntN*)(dst8 + PRE_LOOP_ADJUST);
        srcN = cast(UIntN*)(src8 + PRE_LOOP_ADJUST);
        size_t length = count / TYPE_WIDTH;

        while (length & 7)
        {
            CP_INCR();
            length--;
        }

        length /= 8;

        while (length--)
        {
            CP(0);
            CP(1);
            CP(2);
            CP(3);
            CP(4);
            CP(5);
            CP(6);
            CP(7);
            INC_INDEX(dstN, 8);
            INC_INDEX(srcN, 8);
        }

        src8 = CAST_TO_U8(srcN, 0);
        dst8 = CAST_TO_U8(dstN, 0);

        COPY_REMAINING(count & (TYPE_WIDTH - 1));
    }

    void COPY_SHIFT(int shift)
    {
        dstN = cast(UIntN*)(((cast(UIntN)dst8) + PRE_LOOP_ADJUST) & ~(TYPE_WIDTH - 1));
        srcN = cast(UIntN*)(((cast(UIntN)src8) + PRE_LOOP_ADJUST) & ~(TYPE_WIDTH - 1));
        size_t length = count / TYPE_WIDTH;

        srcWord = INC_VAL(srcN);

        while (length & 7)
        {
            CP_INCR_SH(8 * shift, 8 * (TYPE_WIDTH - shift));
            length--;
        }

        length /= 8;

        while (length--)
        {
            CP_SH(0, 8 * shift, 8 * (TYPE_WIDTH - shift));
            CP_SH(1, 8 * shift, 8 * (TYPE_WIDTH - shift));
            CP_SH(2, 8 * shift, 8 * (TYPE_WIDTH - shift));
            CP_SH(3, 8 * shift, 8 * (TYPE_WIDTH - shift));
            CP_SH(4, 8 * shift, 8 * (TYPE_WIDTH - shift));
            CP_SH(5, 8 * shift, 8 * (TYPE_WIDTH - shift));
            CP_SH(6, 8 * shift, 8 * (TYPE_WIDTH - shift));
            CP_SH(7, 8 * shift, 8 * (TYPE_WIDTH - shift));
            INC_INDEX(dstN, 8);
            INC_INDEX(srcN, 8);
        }

        src8 = CAST_TO_U8(srcN, (shift - TYPE_WIDTH));
        dst8 = CAST_TO_U8(dstN, 0);

        COPY_REMAINING(count & (TYPE_WIDTH - 1));
    }

    if (count < 8)
    {
        COPY_REMAINING(count);
        return dest;
    }

    START_VAL(dst8);
    START_VAL(src8);

    while ((cast(UIntN)dst8 & (TYPE_WIDTH - 1)) != WHILE_DEST_BREAK)
    {
        INC_VAL(dst8) = INC_VAL(src8);
        count--;
    }

    final switch (((cast(UIntN)src8) + PRE_SWITCH_ADJUST) & (TYPE_WIDTH - 1))
    {
    case 0: COPY_NO_SHIFT(); break;
    case 1: COPY_SHIFT(1); break;
    case 2: COPY_SHIFT(2); break;
    case 3: COPY_SHIFT(3); break;
    static if (TYPE_WIDTH >= 4)
    {
    case 4: COPY_SHIFT(4); break;
    case 5: COPY_SHIFT(5); break;
    case 6: COPY_SHIFT(6); break;
    case 7: COPY_SHIFT(7); break;
    }
    }
    return dest;
}
Dec 29 2011
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/29/11 1:47 PM, Walter Bright wrote:
 On 12/29/2011 3:19 AM, Vladimir Panteleev wrote:
 I'd like to invite you to translate Daniel Vik's C memcpy
 implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html
Challenge accepted.
[snip] Benchmarks? Andrei
Dec 29 2011
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/29/2011 11:47 AM, Walter Bright wrote:
 On 12/29/2011 3:19 AM, Vladimir Panteleev wrote:
 I'd like to invite you to translate Daniel Vik's C memcpy implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html
Challenge accepted.
This does compile, though I did not test or benchmark it.

Examining the assembler output, it inlines everything except COPY_SHIFT, COPY_NO_SHIFT, and COPY_REMAINING. The inliner in dmd could definitely be improved, but that is not a problem with the language, but the implementation.

Continuing in that vein, please note that neither C nor C++ require inlining of any sort. The "inline" keyword is merely a hint to the compiler. What inlining takes place is completely implementation defined, not language defined.

The same goes for all those language extensions you mentioned. Those are not part of Standard C. They are vendor extensions. Does that mean that C is not actually a systems language? No.

I wish to note that the D version semantically accomplishes the same thing as the C version without using mixins or CTFE - it's all straightforward code, without the abusive preprocessor tricks.
Dec 29 2011
parent reply so <so so.so> writes:
On Thu, 29 Dec 2011 22:00:12 +0200, Walter Bright  
<newshound2 digitalmars.com> wrote:

 Examining the assembler output, it inlines everything except COPY_SHIFT,  
 COPY_NO_SHIFT, and COPY_REMAINING. The inliner in dmd could definitely  
 be improved, but that is not a problem with the language, but the  
 implementation.

 Continuing in that vein, please note that neither C nor C++ require  
 inlining of any sort. The "inline" keyword is merely a hint to the  
 compiler. What inlining takes place is completely implementation  
 defined, not language defined.

 The same goes for all those language extensions you mentioned. Those are  
 not part of Standard C. They are vendor extensions. Does that mean that  
 C is not actually a systems language? No.

 I wish to note that the D version semantically accomplishes the same  
 thing as the C version without using mixins or CTFE - it's all  
 straightforward code, without the abusive preprocessor tricks.
Yet every big C/C++ compiler has to support it, no? Let's forget D for a second. Will you, as a compiler vendor, support controlled inlining in DMD with an extension? Or let me try another way: will you "let" the community do it?
Dec 29 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/29/2011 12:23 PM, so wrote:
 Yet every big C/C++ compiler has to support it, no?
 Lets forget D for a second.
 Will you, as a compiler vendor support controlled inline in DMD with an
extension?
 Or let me try another way, will you "let" community to do it?
You can do a pull request for it, and we can evaluate it.
Dec 29 2011
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/29/2011 11:47 AM, Walter Bright wrote:
 On 12/29/2011 3:19 AM, Vladimir Panteleev wrote:
 I'd like to invite you to translate Daniel Vik's C memcpy implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html
Challenge accepted.
Here's another version that uses string mixins to ensure inlining of the COPY functions. There are no call instructions in the generated code. This should be as good as the C version using the same code generator. ---------------- /******************************************************************** ** File: memcpy.c ** ** Copyright (C) 1999-2010 Daniel Vik ** ** This software is provided 'as-is', without any express or implied ** warranty. In no event will the authors be held liable for any ** damages arising from the use of this software. ** Permission is granted to anyone to use this software for any ** purpose, including commercial applications, and to alter it and ** redistribute it freely, subject to the following restrictions: ** ** 1. The origin of this software must not be misrepresented; you ** must not claim that you wrote the original software. If you ** use this software in a product, an acknowledgment in the ** use this software in a product, an acknowledgment in the ** product documentation would be appreciated but is not ** required. ** ** 2. Altered source versions must be plainly marked as such, and ** must not be misrepresented as being the original software. ** ** 3. This notice may not be removed or altered from any source ** distribution. ** ** ** Description: Implementation of the standard library function memcpy. ** This implementation of memcpy() is ANSI-C89 compatible. ** ** The following configuration options can be set: ** ** LITTLE_ENDIAN - Uses processor with little endian ** addressing. Default is big endian. ** ** PRE_INC_PTRS - Use pre increment of pointers. ** Default is post increment of ** pointers. ** ** INDEXED_COPY - Copying data using array indexing. ** Using this option, disables the ** PRE_INC_PTRS option. ** ** MEMCPY_64BIT - Compiles memcpy for 64 bit ** architectures ** ** ** Best Settings: ** ** Intel x86: LITTLE_ENDIAN and INDEXED_COPY ** *******************************************************************/ module memcpy; /******************************************************************** ** Configuration definitions. *******************************************************************/ version = LITTLE_ENDIAN; version = INDEXED_COPY; /******************************************************************** ** Includes for size_t definition *******************************************************************/ /******************************************************************** ** Typedefs *******************************************************************/ alias ubyte UInt8; alias ushort UInt16; alias uint UInt32; alias ulong UInt64; version (D_LP64) { alias UInt64 UIntN; enum TYPE_WIDTH = 8; } else { alias UInt32 UIntN; enum TYPE_WIDTH = 4; } /******************************************************************** ** Remove definitions when INDEXED_COPY is defined. *******************************************************************/ //#if defined (INDEXED_COPY) //#if defined (PRE_INC_PTRS) //#undef PRE_INC_PTRS //#endif /*PRE_INC_PTRS*/ //#endif /*INDEXED_COPY*/ /******************************************************************** ** Definitions for pre and post increment of pointers. 
*******************************************************************/ version (PRE_INC_PTRS) { void START_VAL(ref UInt8* x) { x--; } ref T INC_VAL(T)(ref T* x) { return *++x; } UInt8* CAST_TO_U8(void* p, int o) { return cast(UInt8*)p + o + TYPE_WIDTH; } enum WHILE_DEST_BREAK = (TYPE_WIDTH - 1); enum PRE_LOOP_ADJUST = -(TYPE_WIDTH - 1); enum PRE_SWITCH_ADJUST = 1; } else { void START_VAL(UInt8* x) { } ref T INC_VAL(T)(ref T* x) { return *x++; } UInt8* CAST_TO_U8(void* p, int o) { return cast(UInt8*)p + o; } enum WHILE_DEST_BREAK = 0; enum PRE_LOOP_ADJUST = 0; enum PRE_SWITCH_ADJUST = 0; } /******************************************************************** ** ** void *memcpy(void *dest, const void *src, size_t count) ** ** Args: dest - pointer to destination buffer ** src - pointer to source buffer ** count - number of bytes to copy ** ** Return: A pointer to destination buffer ** ** Purpose: Copies count bytes from src to dest. ** No overlap check is performed. ** *******************************************************************/ void *memcpy(void *dest, const void *src, size_t count) { auto dst8 = cast(UInt8*)dest; auto src8 = cast(UInt8*)src; UIntN* dstN; UIntN* srcN; UIntN dstWord; UIntN srcWord; /******************************************************************** ** Macros for copying words of different alignment. ** Uses incremening pointers. *******************************************************************/ void CP_INCR() { INC_VAL(dstN) = INC_VAL(srcN); } void CP_INCR_SH(int shl, int shr) { version (LITTLE_ENDIAN) { dstWord = srcWord >> shl; srcWord = INC_VAL(srcN); dstWord |= srcWord << shr; INC_VAL(dstN) = dstWord; } else { dstWord = srcWord << shl; srcWord = INC_VAL(srcN); dstWord |= srcWord >> shr; INC_VAL(dstN) = dstWord; } } /******************************************************************** ** Macros for copying words of different alignment. ** Uses array indexes. *******************************************************************/ void CP_INDEX(size_t idx) { dstN[idx] = srcN[idx]; } void CP_INDEX_SH(size_t x, int shl, int shr) { version (LITTLE_ENDIAN) { dstWord = srcWord >> shl; srcWord = srcN[x]; dstWord |= srcWord << shr; dstN[x] = dstWord; } else { dstWord = srcWord << shl; srcWord = srcN[x]; dstWord |= srcWord >> shr; dstN[x] = dstWord; } } /******************************************************************** ** Macros for copying words of different alignment. ** Uses incremening pointers or array indexes depending on ** configuration. 
*******************************************************************/ version (INDEXED_COPY) { void CP(size_t idx) { CP_INDEX(idx); } void CP_SH(size_t idx, int shl, int shr) { CP_INDEX_SH(idx, shl, shr); } void INC_INDEX(T)(ref T* p, size_t o) { p += o; } } else { void CP(size_t idx) { CP_INCR(); } void CP_SH(size_t idx, int shl, int shr) { CP_INCR_SH(shl, shr); } void INC_INDEX(T)(T* p, size_t o) { } } static immutable string COPY_REMAINING = q{ START_VAL(dst8); START_VAL(src8); switch (cnt) { case 7: INC_VAL(dst8) = INC_VAL(src8); case 6: INC_VAL(dst8) = INC_VAL(src8); case 5: INC_VAL(dst8) = INC_VAL(src8); case 4: INC_VAL(dst8) = INC_VAL(src8); case 3: INC_VAL(dst8) = INC_VAL(src8); case 2: INC_VAL(dst8) = INC_VAL(src8); case 1: INC_VAL(dst8) = INC_VAL(src8); case 0: default: break; } }; static immutable string COPY_NO_SHIFT = q{ dstN = cast(UIntN*)(dst8 + PRE_LOOP_ADJUST); srcN = cast(UIntN*)(src8 + PRE_LOOP_ADJUST); size_t length = count / TYPE_WIDTH; while (length & 7) { CP_INCR(); length--; } length /= 8; while (length--) { CP(0); CP(1); CP(2); CP(3); CP(4); CP(5); CP(6); CP(7); INC_INDEX(dstN, 8); INC_INDEX(srcN, 8); } src8 = CAST_TO_U8(srcN, 0); dst8 = CAST_TO_U8(dstN, 0); { const cnt = (count & (TYPE_WIDTH - 1)); mixin(COPY_REMAINING); } }; static immutable string COPY_SHIFT = q{ dstN = cast(UIntN*)(((cast(UIntN)dst8) + PRE_LOOP_ADJUST) & ~(TYPE_WIDTH - 1)); srcN = cast(UIntN*)(((cast(UIntN)src8) + PRE_LOOP_ADJUST) & ~(TYPE_WIDTH - 1)); size_t length = count / TYPE_WIDTH; srcWord = INC_VAL(srcN); while (length & 7) { CP_INCR_SH(8 * shift, 8 * (TYPE_WIDTH - shift)); length--; } length /= 8; while (length--) { CP_SH(0, 8 * shift, 8 * (TYPE_WIDTH - shift)); CP_SH(1, 8 * shift, 8 * (TYPE_WIDTH - shift)); CP_SH(2, 8 * shift, 8 * (TYPE_WIDTH - shift)); CP_SH(3, 8 * shift, 8 * (TYPE_WIDTH - shift)); CP_SH(4, 8 * shift, 8 * (TYPE_WIDTH - shift)); CP_SH(5, 8 * shift, 8 * (TYPE_WIDTH - shift)); CP_SH(6, 8 * shift, 8 * (TYPE_WIDTH - shift)); CP_SH(7, 8 * shift, 8 * (TYPE_WIDTH - shift)); INC_INDEX(dstN, 8); INC_INDEX(srcN, 8); } src8 = CAST_TO_U8(srcN, (shift - TYPE_WIDTH)); dst8 = CAST_TO_U8(dstN, 0); { const cnt = (count & (TYPE_WIDTH - 1)); mixin(COPY_REMAINING); } }; if (count < 8) { const cnt = count; mixin(COPY_REMAINING); return dest; } START_VAL(dst8); START_VAL(src8); while ((cast(UIntN)dst8 & (TYPE_WIDTH - 1)) != WHILE_DEST_BREAK) { INC_VAL(dst8) = INC_VAL(src8); count--; } final switch (((cast(UIntN)src8) + PRE_SWITCH_ADJUST) & (TYPE_WIDTH - 1)) { case 0: mixin(COPY_NO_SHIFT); break; case 1: { const shift = 1; mixin(COPY_SHIFT); } break; case 2: { const shift = 2; mixin(COPY_SHIFT); } break; case 3: { const shift = 3; mixin(COPY_SHIFT); } break; static if (TYPE_WIDTH >= 4) { case 4: { const shift = 4; mixin(COPY_SHIFT); } break; case 5: { const shift = 5; mixin(COPY_SHIFT); } break; case 6: { const shift = 6; mixin(COPY_SHIFT); } break; case 7: { const shift = 7; mixin(COPY_SHIFT); } break; } } return dest; }
Dec 29 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/29/11 2:29 PM, Walter Bright wrote:
 On 12/29/2011 11:47 AM, Walter Bright wrote:
 On 12/29/2011 3:19 AM, Vladimir Panteleev wrote:
 I'd like to invite you to translate Daniel Vik's C memcpy
 implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html
Challenge accepted.
Here's another version that uses string mixins to ensure inlining of the COPY functions. There are no call instructions in the generated code. This should be as good as the C version using the same code generator.
[snip] In other news, TAB has died with Kim-Jong Il. Please stop using it. Andrei
Dec 29 2011
parent "Nick Sabalausky" <a a.a> writes:
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:jdilar$k66$1 digitalmars.com...
 On 12/29/11 2:29 PM, Walter Bright wrote:
 On 12/29/2011 11:47 AM, Walter Bright wrote:
 On 12/29/2011 3:19 AM, Vladimir Panteleev wrote:
 I'd like to invite you to translate Daniel Vik's C memcpy
 implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html
Challenge accepted.
Here's another version that uses string mixins to ensure inlining of the COPY functions. There are no call instructions in the generated code. This should be as good as the C version using the same code generator.
[snip] In other news, TAB has died with Kim-Jong Il. Please stop using it.
Tab is indeed evil when certain people insist it should be size 8 ;)
Jan 02 2012
prev sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 29 December 2011 at 19:47:39 UTC, Walter Bright 
wrote:
 On 12/29/2011 3:19 AM, Vladimir Panteleev wrote:
 I'd like to invite you to translate Daniel Vik's C memcpy 
 implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html
Challenge accepted.
Ah, a direct translation using functions! This is probably the most elegant approach, however - as I'm sure you've noticed - the programmer has no control over what gets inlined.
 Examining the assembler output, it inlines everything except 
 COPY_SHIFT, COPY_NO_SHIFT, and COPY_REMAINING. The inliner in 
 dmd could definitely be improved, but that is not a problem 
 with the language, but the implementation.
This is the problem with heuristic inlining: while great by itself, in a position such as this the programmer is left with no choice but to examine the assembler output to make sure the compiler does what the programmer wants it to do. Such behavior can change from one implementation to another, and even from one compiler version to another. (After all, I don't think that we can guarantee that what's inlined today, will be inlined tomorrow.)
 Continuing in that vein, please note that neither C nor C++ 
 require inlining of any sort. The "inline" keyword is merely a 
 hint to the compiler. What inlining takes place is completely 
 implementation defined, not language defined.
I think we can agree that the C inline hint is of limited use. However, major C compiler vendors implement an extension to force inlining. Generally, I would say that common vendor extensions seen in other languages are an opportunity for D to avoid a similar mess: such extensions would not have to be required to be implemented, but when they are, they would use the same syntax across implementations.
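To make the idea concrete, here is a minimal sketch of what a shared spelling could look like. The pragma form below is purely hypothetical (no compiler in this thread implements it), and the function name is made up for illustration; the point is only that an implementation may decline to honor the request, but should not spell it differently:

// Hypothetical: a force-inline request that an implementation may reject
// with an error, but must not silently ignore.
pragma(inline, true)
int addClamped(int a, int b)
{
    long r = cast(long)a + b;               // widen to avoid overflow
    return r > int.max ? int.max : cast(int)r;
}

void main()
{
    assert(addClamped(int.max, 1) == int.max);
}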
 I wish to note that the D version semantically accomplishes the 
 same thing as the C version without using mixins or CTFE - it's 
 all straightforward code, without the abusive preprocessor 
 tricks.
I don't think there's much value in that statement. After all, except for a few occasional templates (which weren't strictly necessary), your translation uses few D-specific features. If you were to leave yourself at the mercy of a C compiler's optimizer, your rewrite would merely be a testament against C macros, not the power of D. However, the most important part is: this translation is incorrect. C macros in the original code provide a guarantee that the code is inlined. D cannot make such guarantees - even your amended version is tuned to one specific implementation (and possibly, only a specific range of versions of it).
Dec 29 2011
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/29/2011 9:51 PM, Vladimir Panteleev wrote:
 Ah, a direct translation using functions! This is probably the most elegant
 approach, however - as I'm sure you've noticed - the programmer has no control
 over what gets inlined.
The programmer also has no control over which variables go into which registers. (Early C compilers did provide this.)
 I think we can agree that the C inline hint is of limited use. However, major C
 compiler vendors implement an extension to force inlining.
I know.
 I don't think there's much value in that statement. After all, except for a few
 occasional templates (which weren't strictly necessary), your translation uses
 few D-specific features. If you were to leave yourself at the mercy of a C
 compiler's optimizer, your rewrite would merely be a testament against C
macros,
 not the power of D.
I think this criticism is off target, because the C example was almost entirely macros - and macros that were used in the service of evading C language limitations. The point wasn't to use clever D features, the challenge was to demonstrate you can get the same results in D as in C.
 However, the most important part is: this translation is incorrect. C macros in
 the original code provide a guarantee that the code is inlined. D cannot make
 such guarantees - even your amended version is tuned to one specific
 implementation (and possibly, only a specific range of versions of it).
I also think this is off target, because a C compiler really doesn't guarantee **** about efficiency, it only guarantees that it will work "as if" it was executed on some idealized abstract machine. Even dividing code up into functions is completely arbitrary, and open to wildly different strategies that are perfectly legal to any C compiler. A C compiler doesn't have to enregister anything in variables, either, and that has far more of a performance impact than inlining.

There is a very wide range of code generation techniques that compilers employ. All of them, to verify that they are being applied, require inspection of the assembler output. Many argue that the compiler should tell you about inlining - but what about all those others? I think the focus on inlining (as opposed to other possible optimizations) is out of proportion, likely exacerbated by dmd needing to do a better job of it.

I completely agree that DMD's inliner is underpowered and needs improvement. I am less sure that this demonstrates that the language needs changes.

Functions below a certain size should be inlined if possible. Those above that size do not benefit perceptibly from inlining. Where that certain size exactly is, who knows, but I doubt that functions near that size will benefit much from user intervention.
Dec 29 2011
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Friday, 30 December 2011 at 06:53:06 UTC, Walter Bright wrote:
 I think this criticism is off target, because the C example was 
 almost entirely macros - and macros that were used in the 
 service of evading C language limitations. The point wasn't to 
 use clever D features, the challenge was to demonstrate you can 
 get the same results in D as in C.
...
 I also think this is off target, because a C compiler really 
 doesn't guarantee **** about efficiency, it only guarantees 
 that it will work "as if" it was executed on some idealized 
 abstract machine. Even dividing code up into functions is 
 completely arbitrary, and open to wildly different strategies 
 that are perfectly legal to any C compiler. A C compiler 
 doesn't have to enregister anything in variables, either, and 
 that has far more of a performance impact than inlining.
Even though the core languages (of C and D) are not specific to any one platform, writing fast code has never been about targeting abstract idealized virtual machines. Some assumptions need to be made. Most assumptions that the C memcpy code makes can be expected to generally be true across major C compilers (e.g. macros are at least as fast as regular functions). However, your D port makes some rather fragile assumptions regarding the compiler implementation.

Let's eliminate the language distinction, and consider two memcpy versions - one using macros, the other using functions (not even with "inline"). Would you say that the second is generally as fast as the first? I'm being intentionally vague: saying that their performance is "about the same" rests on MUCH more fragile assumptions.

The fact that major compiler vendors implement language extensions to facilitate writing optimized code shows that there is a demand for it. Even compilers that are great at optimization (GCC, LLVM) have such intrinsics.

I'm not necessarily advocating changing the core language (e.g. new attributes, things that would need to go into TDPLv2). However, what I think would greatly improve the situation is to have DigitalMars provide recommendations for implementation-specific extensions that provide more control with regards to how the code is compiled (pragma names, keywords starting with __, etc.). Once they're defined, pull requests to add them to DMD will follow.
 Functions below a certain size should be inlined if possible. 
 Those above that size do not benefit perceptibly from inlining. 
 Where that certain size exactly is, who knows, but I doubt that 
 functions near that size will benefit much from user 
 intervention.
I agree, but this wasn't so much about heuristics as about compiler capabilities
(e.g. inlining assembler functions).
Dec 30 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 12:16 AM, Vladimir Panteleev wrote:
 I agree, but this wasn't as much about heuristics, but compiler capabilities
 (e.g. inlining assembler functions).
Adding a keyword won't fix the current problem that the compiler won't inline inline assembler functions. It's an orthogonal issue. I know there are features on various C compilers to force inlining, I know there's a demand for them. But I've also, over the years, spent thousands and thousands of hours optimizing the hell out of things, so I have some experience with it. Once the compiler gets past a certain level of heuristic inlining decisions, forcing it to inline more is just chasing rainbows. And if one really wants to force an inline, one can do things like the C memcpy using the preprocessor, or string mixins in D, or even cut&paste. If you need to do that in more than a couple places in the code, something else is wrong (that old saw about only a tiny percentage of the code being a bottleneck is true). Also, if you are tweaking at such a level, every compiler is different enough that your tweaks are likely to be counterproductive on another compiler. Having a portable syntax for such tweaking is not going to help.
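For reference, the string-mixin route looks roughly like this; a minimal sketch with made-up names, in the same q{} style as the mixin-based memcpy above:

// The "function" body is just a token string; mixin() pastes it at the use
// site, so there is no call left for an inliner to miss.
enum string bumpBoth = q{
    ++dst;
    ++src;
};

void main()
{
    int dst = 0, src = 10;
    mixin(bumpBoth);            // expands in place, much like a C macro
    assert(dst == 1 && src == 11);
}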
Dec 30 2011
next sibling parent Peter Alexander <peter.alexander.au gmail.com> writes:
On 30/12/11 9:13 AM, Walter Bright wrote:
 And if one really wants to force an inline, one can do things like the C
 memcpy using the preprocessor, or string mixins in D, or even cut&paste.
 If you need to do that in more than a couple places in the code,
 something else is wrong (that old saw about only a tiny percentage of
 the code being a bottleneck is true).
When you are writing really performance sensitive code, that old adage is certainly *not* true. It only happens in practice when you don't care that much about performance. When you really care, you've already optimised those hot spots, so what you end up with is a completely flat profile: no part of the program is the bottleneck, but the whole thing is. At that point, you're likely suffering a death from a thousand cuts: no single part of your program is the bottleneck; your poor performance is just the sum total of a bunch of small performance penalties here and there. A perfect example of this is vector operations. Games use vector operations all over the place, so their impact on performance is spread out over the entire program. You'll never see a dot product or vector addition routine at the top of a profile chart, but it will certainly affect performance!
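A sketch of the kind of routine meant here (not taken from any particular engine; the names are made up): each call does only a few flops, so whether the call overhead disappears decides a large share of the total cost.

struct Vec3
{
    float x, y, z;

    // Tiny and used everywhere: exactly the case where guaranteed inlining matters.
    Vec3 opBinary(string op : "+")(Vec3 rhs) const
    {
        return Vec3(x + rhs.x, y + rhs.y, z + rhs.z);
    }
}

float dot(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

void main()
{
    auto v = Vec3(1, 2, 3) + Vec3(4, 5, 6);
    assert(dot(v, v) == 155);
}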
Dec 30 2011
prev sibling next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Friday, 30 December 2011 at 09:13:05 UTC, Walter Bright wrote:
 Also, if you are tweaking at such a level, every compiler is 
 different enough that your tweaks are likely to be 
 counterproductive on another compiler. Having a portable syntax 
 for such tweaking is not going to help.
Which is exactly why I think an inlining pragma/attribute should provide a guarantee, and not a hint. It's a web of assumptions/guarantees: asm blocks provide their guarantees, but using them introduces new assumptions, that e.g. force-inlining solidifies, etc. Back to the macros vs.
 And if one really wants to force an inline, one can do things 
 like the C memcpy using the preprocessor, or string mixins in 
 D, or even cut&paste.
D has nothing from the above that's elegant and maintainable. Timon's solution comes close, but it uses a DSL to make up for what the language doesn't provide.
 If you need to do that in more than a couple places in the 
 code, something else is wrong (that old saw about only a tiny 
 percentage of the code being a bottleneck is true).
What about the context of creating an optimized library, as opposed to optimizing one application?
Dec 30 2011
next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Friday, 30 December 2011 at 12:00:07 UTC, Vladimir Panteleev 
wrote:
 Back to the macros vs.
Oops, didn't mean to send that. I was going to write that comparing C macros with __forceinline functions is a much more level comparison.
Dec 30 2011
prev sibling parent reply so <so so.so> writes:
On Fri, 30 Dec 2011 14:00:06 +0200, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Friday, 30 December 2011 at 09:13:05 UTC, Walter Bright wrote:
 Also, if you are tweaking at such a level, every compiler is different  
 enough that your tweaks are likely to be counterproductive on another  
 compiler. Having a portable syntax for such tweaking is not going to  
 help.
Which is exactly why I think an inlining pragma/attribute should provide a guarantee, and not a hint. It's a web of assumptions/guarantees: asm blocks provide their guarantees, but using them introduces new assumptions, that e.g. force-inlining solidifies, etc.
I agree inline (which will probably be an extension) in D should mean force-inline. Ignoring the impossible-to-inline cases (which in time should get better), adding inline is a few minutes of editing. It will just bypass the cost function, and if it is not possible to inline, pop an error. I don't have enough knowledge of DMD internals, so I am not sure if I should go do it, or maybe I need to start somewhere...
Dec 30 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 7:06 AM, so wrote:
 I agree  inline (which will probably be an extension) in D should mean
 force-inline.
 Ignoring the impossible-to-inline cases (which in time should get better),
 adding  inline is a few minutes of editing.
 It will just bypass the cost function and if it is not possible to inline, pop
 error.
Sure, but I think you'll be very disappointed in that it isn't going to deliver the goods.
Dec 30 2011
next sibling parent Chad J <chadjoan __spam.is.bad__gmail.com> writes:
On 12/30/2011 01:48 PM, Walter Bright wrote:
 On 12/30/2011 7:06 AM, so wrote:
 I agree  inline (which will probably be an extension) in D should mean
 force-inline.
 Ignoring the impossible-to-inline cases (which in time should get
 better),
 adding  inline is a few minutes of editing.
 It will just bypass the cost function and if it is not possible to
 inline, pop
 error.
Sure, but I think you'll be very disappointed in that it isn't going to deliver the goods.
Cool. Put it in and let people use it and get disappointed. Then maybe they will blame themselves instead of DMD. ????. Profit.
Dec 30 2011
prev sibling parent reply so <so so.so> writes:
On Fri, 30 Dec 2011 20:48:54 +0200, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 12/30/2011 7:06 AM, so wrote:
 I agree  inline (which will probably be an extension) in D should mean
 force-inline.
 Ignoring the impossible-to-inline cases (which in time should get  
 better),
 adding  inline is a few minutes of editing.
 It will just bypass the cost function and if it is not possible to  
 inline, pop
 error.
Sure, but I think you'll be very disappointed in that it isn't going to deliver the goods.
dmd_inl -O -inline test.d dmd_inl -O -inline test_inl.d time ./test real 0m4.686s user 0m3.516s sys 0m0.007s time ./test_inl real 0m1.900s user 0m1.503s sys 0m0.007s time ./test real 0m4.381s user 0m3.520s sys 0m0.010s time ./test_inl real 0m1.955s user 0m1.473s sys 0m0.037s time ./test real 0m4.473s user 0m3.506s sys 0m0.017s time ./test_inl real 0m1.836s user 0m1.507s sys 0m0.007s time ./test real 0m4.627s user 0m3.523s sys 0m0.003s time ./test_inl real 0m1.984s user 0m1.480s sys 0m0.030s Just bypassing cost escape, I ll try some complex cases soon after i get phobos working. int test() // test.d int test() inline // test_inl.d { int i = 0; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; return i; } void main() { for(uint i=0; i<1_000_000_000; ++i) test(); }
Dec 30 2011
parent reply Iain Buclaw <ibuclaw ubuntu.com> writes:
On 31 December 2011 00:48, so <so so.so> wrote:
 On Fri, 30 Dec 2011 20:48:54 +0200, Walter Bright
 <newshound2 digitalmars.com> wrote:

 On 12/30/2011 7:06 AM, so wrote:
 I agree  inline (which will probably be an extension) in D should mean
 force-inline.
 Ignoring the impossible-to-inline cases (which in time should get
 better),
 adding  inline is a few minutes of editing.
 It will just bypass the cost function and if it is not possible to
 inline, pop
 error.
Sure, but I think you'll be very disappointed in that it isn't going to deliver the goods.
 dmd_inl -O -inline test.d dmd_inl -O -inline test_inl.d time ./test real   0m4.686s user   0m3.516s sys    0m0.007s time ./test_inl real   0m1.900s user   0m1.503s sys    0m0.007s time ./test
*SNIP*
 void main()
 {
         for(uint i=0; i<1_000_000_000; ++i)
                 test();
 }
A better compiler would see that the function 'test' has no side effects and its return value is unused, so it eliminates the call to it completely as dead code.

--
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
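One way to make such a benchmark resistant to that, sketched below with an arbitrary loop count and a stand-in body: feed the result into observable output, so the loop is not dead code (a clever compiler may still constant-fold test() itself, but it can no longer discard the work wholesale).

import std.stdio : writeln;

int test()
{
    int i = 0;
    foreach (_; 0 .. 94)        // stand-in for the hand-unrolled ++i chain
        ++i;
    return i;
}

void main()
{
    long sum = 0;
    foreach (n; 0 .. 1_000_000_000)
        sum += test();          // the result is used...
    writeln(sum);               // ...and printed, so the loop is not dead code
}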
Dec 30 2011
parent reply so <so so.so> writes:
On Sat, 31 Dec 2011 03:12:38 +0200, Iain Buclaw <ibuclaw ubuntu.com> wrote:

 On 31 December 2011 00:48, so <so so.so> wrote:
 On Fri, 30 Dec 2011 20:48:54 +0200, Walter Bright
 <newshound2 digitalmars.com> wrote:

 On 12/30/2011 7:06 AM, so wrote:
 I agree  inline (which will probably be an extension) in D should mean
 force-inline.
 Ignoring the impossible-to-inline cases (which in time should get
 better),
 adding  inline is a few minutes of editing.
 It will just bypass the cost function and if it is not possible to
 inline, pop
 error.
Sure, but I think you'll be very disappointed in that it isn't going to deliver the goods.
dmd_inl -O -inline test.d dmd_inl -O -inline test_inl.d time ./test real 0m4.686s user 0m3.516s sys 0m0.007s time ./test_inl real 0m1.900s user 0m1.503s sys 0m0.007s time ./test
*SNIP*
 void main()
 {
        for(uint i=0; i<1_000_000_000; ++i)
                test();
 }
A better compiler would see that the function 'test' has no side effects, and it's return value is unused, so elimates the call to it completely as dead code.
It is just a dummy function that dmd refused to inline; send me a better one (which won't use any libraries) and I'll use it :)
Dec 30 2011
parent reply Iain Buclaw <ibuclaw ubuntu.com> writes:
On 31 December 2011 01:21, so <so so.so> wrote:
 On Sat, 31 Dec 2011 03:12:38 +0200, Iain Buclaw <ibuclaw ubuntu.com> wrote:
 On 31 December 2011 00:48, so <so so.so> wrote:
 On Fri, 30 Dec 2011 20:48:54 +0200, Walter Bright
 <newshound2 digitalmars.com> wrote:

 On 12/30/2011 7:06 AM, so wrote:
 I agree  inline (which will probably be an extension) in D should mean
 force-inline.
 Ignoring the impossible-to-inline cases (which in time should get
 better),
 adding  inline is a few minutes of editing.
 It will just bypass the cost function and if it is not possible to
 inline, pop
 error.
Sure, but I think you'll be very disappointed in that it isn't going to
 deliver the goods.
 dmd_inl -O -inline test.d dmd_inl -O -inline test_inl.d time ./test real   0m4.686s user   0m3.516s sys    0m0.007s time ./test_inl real   0m1.900s user   0m1.503s sys    0m0.007s time ./test
*SNIP*
 void main()
 {
        for(uint i=0; i<1_000_000_000; ++i)
                test();
 }
 A better compiler would see that the function 'test' has no side effects, and its return value is unused, so eliminates the call to it completely as dead code.
It is just a dummy function that dmd rejected to inline, send me a better one (which won't use any libraries) and i'll use it :)
Take a pick of any examples posted on this ML. They are a far better fit to use as a test bed, ideally one that does number crunching and can't be easily folded away.

Regards
--
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
Dec 30 2011
next sibling parent reply so <so so.so> writes:
On Sat, 31 Dec 2011 03:40:43 +0200, Iain Buclaw <ibuclaw ubuntu.com> wrote:

 Take a pick of any examples posted on this ML.  They are far better
 fit to use as a test bed.  Ideally one that does number crunching and
 can't be easily folded away.
Well, not them, but another dummy function; I didn't think it would differ this much.

time ./test_inl
real    0m0.013s
user    0m0.007s
sys     0m0.003s

time ./test
real    0m7.753s
user    0m5.966s
sys     0m0.013s

time ./test_inl
real    0m0.013s
user    0m0.010s
sys     0m0.000s

time ./test
real    0m7.391s
user    0m5.960s
sys     0m0.017s

time ./test_inl
real    0m0.014s
user    0m0.007s
sys     0m0.003s

time ./test
real    0m7.582s
user    0m5.950s
sys     0m0.030s

real test() // test.d
real test()  inline // test_inl.d
{
real a=423123, b=432, c=10, d=100, e=4045, f=123;
a = a / b * c / d + e - f;
b = a / b * c / d + e - f;
c = a / b * c / d + e - f;
d = a / b * c / d + e - f;
e = a / b * c / d + e - f;
f = a / b * c / d + e - f;
a = a / b * c / d + e - f;
b = a / b * c / d + e - f;
c = a / b * c / d + e - f;
d = a / b * c / d + e - f;
e = a / b * c / d + e - f;
f = a / b * c / d + e - f;
a = a / b * c / d + e - f;
b = a / b * c / d + e - f;
c = a / b * c / d + e - f;
d = a / b * c / d + e - f;
e = a / b * c / d + e - f;
f = a / b * c / d + e - f;
return f;
}

void main()
{
for(uint i=0; i<1_000_000_0; ++i)
test();
}
Dec 30 2011
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 5:59 PM, so wrote:
 Well not them but another dummy function, i didn't think it would differ this
much.
It differs that much because once it is inlined, the optimizer deletes it because it does nothing. I don't think it is a valid test.
Dec 30 2011
parent reply so <so so.so> writes:
On Sat, 31 Dec 2011 04:30:01 +0200, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 12/30/2011 5:59 PM, so wrote:
 Well not them but another dummy function, i didn't think it would  
 differ this much.
It differs that much because once it is inlined, the optimizer deletes it because it does nothing. I don't think it is a valid test.
Yes, I can see that from the asm output, but are we talking about the same thing? inline IS all about that. We can try it with any example; one outperforming the other is not the point.

for(....)
    fun()

With or without inline, I know fun should/will get folded away, so why should I pay for the function call?
Dec 30 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 6:35 PM, so wrote:
 With or without  inline i know fun should/will get folded away, then why should
 i pay for the function call?
Because if the function did anything useful, the overhead of the function call is insignificant. I don't think that dealing with large, complex functions that do nothing merits a language extension.
Dec 30 2011
prev sibling parent reply Mike Wey <mike-wey example.com> writes:
On 12/31/2011 02:59 AM, so wrote:
 On Sat, 31 Dec 2011 03:40:43 +0200, Iain Buclaw <ibuclaw ubuntu.com> wrote:

 Take a pick of any examples posted on this ML. They are far better
 fit to use as a test bed. Ideally one that does number crunching and
 can't be easily folded away.
Well not them but another dummy function, i didn't think it would differ this much.
real test() nothrow pure
 real test() // test.d
 real test()  inline // test_inl.d
 {
 real a=423123, b=432, c=10, d=100, e=4045, f=123;
 a = a / b * c / d + e - f;
 b = a / b * c / d + e - f;
 c = a / b * c / d + e - f;
 d = a / b * c / d + e - f;
 e = a / b * c / d + e - f;
 f = a / b * c / d + e - f;
 a = a / b * c / d + e - f;
 b = a / b * c / d + e - f;
 c = a / b * c / d + e - f;
 d = a / b * c / d + e - f;
 e = a / b * c / d + e - f;
 f = a / b * c / d + e - f;
 a = a / b * c / d + e - f;
 b = a / b * c / d + e - f;
 c = a / b * c / d + e - f;
 d = a / b * c / d + e - f;
 e = a / b * c / d + e - f;
 f = a / b * c / d + e - f;
 return f;
 }

 void main()
 {
 for(uint i=0; i<1_000_000_0; ++i)
 test();
 }
When marking the function as pure and nothrow dmd is able to optimize the loop:

.text._Dmain    segment
        assume  CS:.text._Dmain
_Dmain:
        push    RBP
        mov     RBP,RSP
        xor     EAX,EAX
L6:     inc     EAX
        cmp     EAX,0989680h
        jb      L6
        xor     EAX,EAX
        pop     RBP
        ret
.text._Dmain    ends

--
Mike Wey
Dec 31 2011
parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 31 December 2011 13:05, Mike Wey <mike-wey example.com> wrote:
 On 12/31/2011 02:59 AM, so wrote:
 On Sat, 31 Dec 2011 03:40:43 +0200, Iain Buclaw <ibuclaw ubuntu.com>
 wrote:

 Take a pick of any examples posted on this ML. They are far better
 fit to use as a test bed. Ideally one that does number crunching and
 can't be easily folded away.
Well not them but another dummy function, i didn't think it would differ this much.
real test() nothrow pure
 real test() // test.d
 real test()  inline // test_inl.d
 {
 real a=423123, b=432, c=10, d=100, e=4045, f=123;
 a = a / b * c / d + e - f;
 b = a / b * c / d + e - f;
 c = a / b * c / d + e - f;
 d = a / b * c / d + e - f;
 e = a / b * c / d + e - f;
 f = a / b * c / d + e - f;
 a = a / b * c / d + e - f;
 b = a / b * c / d + e - f;
 c = a / b * c / d + e - f;
 d = a / b * c / d + e - f;
 e = a / b * c / d + e - f;
 f = a / b * c / d + e - f;
 a = a / b * c / d + e - f;
 b = a / b * c / d + e - f;
 c = a / b * c / d + e - f;
 d = a / b * c / d + e - f;
 e = a / b * c / d + e - f;
 f = a / b * c / d + e - f;
 return f;
 }

 void main()
 {
 for(uint i=0; i<1_000_000_0; ++i)
 test();
 }
When marking the function as pure and nothrow dmd is able to optimize the loop:

 .text._Dmain    segment
         assume  CS:.text._Dmain
 _Dmain:
         push    RBP
         mov     RBP,RSP
         xor     EAX,EAX
 L6:     inc     EAX
         cmp     EAX,0989680h
         jb      L6
         xor     EAX,EAX
         pop     RBP
         ret
 .text._Dmain    ends

 --
 Mike Wey
Yep, as I've mentioned earlier, the function has no side effects and its return value is not used, hence it can be optimised away completely.

--
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
Dec 31 2011
prev sibling parent so <so so.so> writes:
On Sat, 31 Dec 2011 03:40:43 +0200, Iain Buclaw <ibuclaw ubuntu.com> wrote:

 Take a pick of any examples posted on this ML.  They are far better
 fit to use as a test bed.  Ideally one that does number crunching and
 can't be easily folded away.
I don't understand your point, btw: why shouldn't it be easily folded away? inline is exactly for that reason; why would I pay for something I don't want?
Dec 30 2011
prev sibling parent reply Chad J <chadjoan __spam.is.bad__gmail.com> writes:
On 12/30/2011 04:13 AM, Walter Bright wrote:
 On 12/30/2011 12:16 AM, Vladimir Panteleev wrote:
 I agree, but this wasn't as much about heuristics, but compiler
 capabilities
 (e.g. inlining assembler functions).
Adding a keyword won't fix the current problem that the compiler won't inline inline assembler functions. It's an orthogonal issue. I know there are features on various C compilers to force inlining, I know there's a demand for them. But I've also, over the years, spent thousands and thousands of hours optimizing the hell out of things, so I have some experience with it. Once the compiler gets past a certain level of heuristic inlining decisions, forcing it to inline more is just chasing rainbows.
When a compiler ISN'T past a certain level of heuristic inlining, then being able to tell it to inline can save one's ass.

I hit this when writing a flash game. It was doing the slide-show thing while on a collision detection broadphase (IIRC) when it went to sort everything. The language I was using, haXe, was pretty young at the time and the compiler probably wasn't inlining well. BUT, it did have an inline keyword. I plopped it down in a few select places and BAM, the broadphase is ~100x faster and life goes on. Things were going to get really damn ugly if I couldn't do that. (haXe is a pretty cool language, just not as featureful as D.)

Nonetheless, this is the less important issue...
 And if one really wants to force an inline, one can do things like the C
 memcpy using the preprocessor, or string mixins in D, or even cut&paste.
 If you need to do that in more than a couple places in the code,
 something else is wrong (that old saw about only a tiny percentage of
 the code being a bottleneck is true).
 
 Also, if you are tweaking at such a level, every compiler is different
 enough that your tweaks are likely to be counterproductive on another
 compiler. Having a portable syntax for such tweaking is not going to help.
This is striking me as becoming a human factors problem. People want a way to tell the compiler to inline things. They are /going/ to get that, one way or another. It /will/ happen, regardless of how experienced /you/ are. They also may not go about it in entirely reasonable ways, and then you end up with code optimized for one compiler that doesn't compile at all on another. This sucks really bad for people compiling a program that they didn't write. And to me, that's what I worry about most. ... As an aside, I think that people want forced inlining because it gives them another tool to tweak with. My experiences with optimization tend to suggest I can usually optimize things really well with a few short cycles of profile->experiment->profile. I don't think I've ever really /needed/ to dive into assembly yet. My ventures into the assembler have been either purely recreational or academic in nature. Now, something like an inline feature can help a lot with the "experiment" part of the cycle. It's just another knob to twist and see if it gives the result you want. Portability be damned, if it gets the thing out the door, I'm using it! But, I kind of hate that attitude. So it's much more comforting to be able to twist that knob without sacrificing portability too. I wouldn't expect it to run as fast on other compilers; I /would/ expect it to compile and run correctly on other compilers. And if enregistering variables is more important, then we might want to have a way to enregister variables too.
Dec 30 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 10:31 AM, Chad J wrote:
 As an aside, I think that people want forced inlining because it gives
 them another tool to tweak with.  My experiences with optimization tend
 to suggest I can usually optimize things really well with a few short
 cycles of profile->experiment->profile.  I don't think I've ever really
 /needed/ to dive into assembly yet.  My ventures into the assembler have
 been either purely recreational or academic in nature.  Now, something
 like an inline feature can help a lot with the "experiment" part of the
 cycle.  It's just another knob to twist and see if it gives the result
 you want.  Portability be damned, if it gets the thing out the door, I'm
 using it!  But, I kind of hate that attitude.  So it's much more
 comforting to be able to twist that knob without sacrificing portability
 too.  I wouldn't expect it to run as fast on other compilers; I /would/
 expect it to compile and run correctly on other compilers.  And if
 enregistering variables is more important, then we might want to have a
 way to enregister variables too.
Back in the olden days, I provided a detailed list of optimizer switches that turned on/off all sorts of optimizations. In the end it turned out that all people wanted was an "optimize" switch which is why dmd has only -O. The reason dmd has a -inline switch is because it's hard to debug code that has been inlined.

The reason C's "register" keyword went away was because:

1. the variables after optimization transformations may be very different than before
2. programmers stunk at picking the right variables for registers
3. even if (2) was done right, as soon as the first code maintainer dinked with it, they never bothered to go fix the register declarations
4. optimizers got pretty good at automatic register allocation
5. there's nothing portable about enregistering, even with a portable syntax
6. the register keyword offered no way to hint which variables were more important to enregister than others
Dec 30 2011
parent Chad J <chadjoan __spam.is.bad__gmail.com> writes:
On 12/30/2011 02:00 PM, Walter Bright wrote:
 On 12/30/2011 10:31 AM, Chad J wrote:
 As an aside, I think that people want forced inlining because it gives
 them another tool to tweak with.  My experiences with optimization tend
 to suggest I can usually optimize things really well with a few short
 cycles of profile->experiment->profile.  I don't think I've ever really
 /needed/ to dive into assembly yet.  My ventures into the assembler have
 been either purely recreational or academic in nature.  Now, something
 like an inline feature can help a lot with the "experiment" part of the
 cycle.  It's just another knob to twist and see if it gives the result
 you want.  Portability be damned, if it gets the thing out the door, I'm
 using it!  But, I kind of hate that attitude.  So it's much more
 comforting to be able to twist that knob without sacrificing portability
 too.  I wouldn't expect it to run as fast on other compilers; I /would/
 expect it to compile and run correctly on other compilers.  And if
 enregistering variables is more important, then we might want to have a
 way to enregister variables too.
Back in the olden days, I provided a detailed list of optimizer switches that turned on/off all sorts of optimizations. In the end it turned out that all people wanted was an "optimize" switch which is why dmd has only -O. The reason dmd has a -inline switch is because it's hard to debug code that has been inlined. The reason C's "register" keyword went away was because: 1. the variables after optimization transformations may be very different than before 2. programmers stunk at picking the right variables for registers 3. even if (2) was done right, as soon as the first code maintainer dinked with it, they never bothered to go fix the register declarations 4. optimizers got pretty good at automatic register allocation 5. there's nothing portable about enregistering, even with a portable syntax 6. the register keyword offered no way to hint which variables were more important to enregister than others
Huh, bummer dudes. 6 seems pretty solvable. Too bad about the other 5. ;)
Dec 30 2011
prev sibling parent Trass3r <un known.com> writes:
 I completely agree that DMD's inliner is underpowered and needs  
 improvement. I am less sure that this demonstrates that the language  
 needs changes.

 Functions below a certain size should be inlined if possible. Those  
 above that size do not benefit perceptibly from inlining. Where that  
 certain size exactly is, who knows, but I doubt that functions near that  
 size will benefit much from user intervention.
More specifically, a distinction like gcc's would be nice:

"-finline-small-functions
Integrate functions into their callers when their body is smaller than expected function call code (so overall size of program gets smaller). The compiler heuristically decides which functions are simple enough to be worth integrating in this way. Enabled at level -O2.

-finline-functions
Integrate all simple functions into their callers. The compiler heuristically decides which functions are simple enough to be worth integrating in this way. Enabled at level -O3."
Dec 30 2011
prev sibling parent reply "Martin Nowak" <dawg dawgfoto.de> writes:
On Fri, 30 Dec 2011 06:51:44 +0100, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Thursday, 29 December 2011 at 19:47:39 UTC, Walter Bright wrote:
 On 12/29/2011 3:19 AM, Vladimir Panteleev wrote:
 I'd like to invite you to translate Daniel Vik's C memcpy  
 implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html
Challenge accepted.
Ah, a direct translation using functions! This is probably the most elegant approach, however - as I'm sure you've noticed - the programmer has no control over what gets inlined.
 Examining the assembler output, it inlines everything except  
 COPY_SHIFT, COPY_NO_SHIFT, and COPY_REMAINING. The inliner in dmd could  
 definitely be improved, but that is not a problem with the language,  
 but the implementation.
This is the problem with heuristic inlining: while great by itself, in a position such as this the programmer is left with no choice but to examine the assembler output to make sure the compiler does what the programmer wants it to do. Such behavior can change from one implementation to another, and even from one compiler version to another. (After all, I don't think that we can guarantee that what's inlined today, will be inlined tomorrow.)
For real performance bottlenecks one should always examine the assembly. For most code, inlining hardly ever matters for the runtime of your program, and focusing on efficient algorithms is most important.

What really baffles me is that people want control over inlining, but nobody seems to ever have noticed that x64 switch doesn't switch and x64 vector ops aren't vectorized, both of which are really important in performance-sensitive code.
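Presumably that refers to array operations of the following shape (a minimal sketch; the sizes and names are arbitrary), which an implementation is free to compile to SIMD loops or to plain scalar code:

void axpy(float[] y, const(float)[] x, float a)
{
    y[] += x[] * a;             // vector-op syntax; ideally one SIMD loop
}

void main()
{
    auto x = new float[](1024);
    auto y = new float[](1024);
    x[] = 1.0f;
    y[] = 2.0f;
    axpy(y, x, 3.0f);
    assert(y[0] == 5.0f && y[1023] == 5.0f);
}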
 Continuing in that vein, please note that neither C nor C++ require  
 inlining of any sort. The "inline" keyword is merely a hint to the  
 compiler. What inlining takes place is completely implementation  
 defined, not language defined.
I think we can agree that the C inline hint is of limited use. However, major C compiler vendors implement an extension to force inlining. Generally, I would say that common vendor extensions seen in other languages are an opportunity for D to avoid a similar mess: such extensions would not have to be required to be implemented, but when they are, they would use the same syntax across implementations.
 I wish to note that the D version semantically accomplishes the same  
 thing as the C version without using mixins or CTFE - it's all  
 straightforward code, without the abusive preprocessor tricks.
I don't think there's much value in that statement. After all, except for a few occasional templates (which weren't strictly necessary), your translation uses few D-specific features. If you were to leave yourself at the mercy of a C compiler's optimizer, your rewrite would merely be a testament against C macros, not the power of D. However, the most important part is: this translation is incorrect. C macros in the original code provide a guarantee that the code is inlined. D cannot make such guarantees - even your amended version is tuned to one specific implementation (and possibly, only a specific range of versions of it).
Jan 03 2012
parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 3 January 2012 at 18:49:35 UTC, Martin Nowak wrote:
 For real performance bottlenecks one should always examine the 
 assembly. For most code inlining hardly ever matters for the 
 runtime of your program and focusing on efficient algorithms is 
 most important.

 What really baffles me is that people want control over inlining
 but nobody seems to ever have noticed that x64 switch doesn't 
 switch and x64 vector ops aren't vectorized. Both of which are 
 really important in performance sensitive code.
Quality of implementations' optimizations and a common syntax for code compilation guarantees are orthogonal issues.
Jan 04 2012
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/29/2011 12:19 PM, Vladimir Panteleev wrote:
 On Thursday, 29 December 2011 at 09:16:23 UTC, Walter Bright wrote:
 Are you a ridiculous hacker? Inline x86 assembly that the compiler
 actually understands in 32 AND 64 bit code, hex string literals like
 x"DE ADB EEF" where spacing doesn't matter, the ability to set data
 alignment cross-platform with type.alignof = 16, load your shellcode
 verbatim into a string like so: auto str = import("shellcode.txt");
I would like to talk about this for a bit. Personally, I think D's system programming abilities are only half-way there. Note that I am not talking about use cases in high-level application code, but rather low-level, widely-used framework code, where every bit of performance matters (for example: memory copy routines, string builders, garbage collectors). In-line assembler as part of the language is certainly neat, and in fact coming from Delphi to C++ I was surprised to learn that C++ implementations adopted different syntax for asm blocks. However, compared to some C++ compilers, it has severe limitations and is D's only trick in this alley. For one thing, there is no way to force the compiler to inline a function (like __forceinline / __attribute((always_inline)) ). This is fine for high-level code (where users are best left with PGO and "the compiler knows best"), but sucks if you need a guarantee that the function must be inlined. The guarantee isn't just about inlining heuristics, but also implementation capabilities. For example, some implementations might not be able to inline functions that use certain language features, and your code's performance could demand that such a short function must be inlined. One example of this is inlining functions containing asm blocks - IIRC DMD does not support this.
That does not mean the language does not support it; ldc and gdc can probably do it.
 The
 compiler should fail the build if it can't inline a function tagged with
  forceinline, instead of shrugging it off and failing silently, forcing
 users to check the disassembly every time.
+1. I think we should extend the 'enum' storage class to functions, and introduce cast(enum) to force instant evaluation.

void foo() enum{...} // always inlined or compile error. declaration alone does not contribute code to the object file
void goo(){...}      // inlined at compiler's discretion

void main(){
    (cast(enum)goo)(); // inlined or compile error
}
 You may have noticed that GCC has some ridiculously complicated
 assembler facilities. However, they also open the way to the
 possibilities of writing optimal code - for example, creating custom
 calling conventions, or inlining assembler functions without restricting
 the caller's register allocation with a predetermined calling
 convention. In contrast, DMD is very conservative when it comes to
 mixing D and assembler. One time I found that putting an asm block in a
 function turned what were single instructions into blocks of 6
 instructions each.

 D's lacking  in this area makes it impossible to create language features
 that are on the level of D's compiler built-ins. For example, I have
 tested three memcpy implementations recently, but none of them could
 beat DMD's standard array slice copy (despite that in release mode it
 compiles to a simple memcpy call). Why? Because the overhead of using a
 custom memcpy routine negated its performance gains.
I don't think you should use DMD to benchmark the D language.
 This might have been alleviated with the presence of sane macros, but no
 such luck. String mixins are not the answer: trying to translate
 macro-heavy C code to D using string mixins is string escape hell, and
 we're back to the level of shell scripts.
No string escape hell if you do it right.
 We've discussed this topic on IRC recently. From what I understood,
 Andrei thinks improvements in this area are not "impactful" enough,
 which I find worrisome.
Me too.
 Personally, I don't think D qualifies as a true "system programming
 language" in light of the above.
Neither do C or C++ without compiler specific extensions. We should definitely standardise such features in D.
 It's more of a compiled language with
 pointers and assembler. Before you disagree with any of the above, first
 (for starters) I'd like to invite you to translate Daniel Vik's C memcpy
 implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html . It doesn't even
 use inline assembler or compiler intrinsics.
OK, will do.
Dec 29 2011
next sibling parent David Nadlinger <see klickverbot.at> writes:
On 12/29/11 9:58 PM, Timon Gehr wrote:
 On 12/29/2011 12:19 PM, Vladimir Panteleev wrote:
 [â€Ļ]One example of this is inlining
 functions containing asm blocks - IIRC DMD does not support this.
That does not mean the language does not support it, probably ldc and gdc can do it.
LDC has pragma(allow_inline), which allows you to mark a function containing inline asm as safe to inline. David
Dec 29 2011
prev sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 29 December 2011 at 20:58:59 UTC, Timon Gehr wrote:
 I don't think you should use DMD to benchmark the D language.
You're missing my point. We can't count on the optimizers in all implementations being perfect. I am suggesting language features which could provide guarantees to the programmer regarding how the code will be compiled. If an implementation cannot satisfy them, the programmer should be told so, so he could try something else - rather than having to sift through disassembler listings or use a profiler.
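
(For illustration only, the kind of guarantee meant here might read something like the following - the attribute name and the fail-the-build behaviour are invented for the sake of the example, nothing like it exists in D today:)

// @forceinline is hypothetical: the point is that the build fails,
// rather than silently falling back, if this cannot be inlined.
@forceinline uint rotl13(uint x)
{
    return (x << 13) | (x >> 19);
}

uint hash(uint x) { return rotl13(x) ^ x; } // guaranteed: no call overhead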
Dec 29 2011
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/29/2011 12:19 PM, Vladimir Panteleev wrote:
 Before you disagree with any of the above, first
 (for starters) I'd like to invite you to translate Daniel Vik's C memcpy
 implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html . It doesn't even
 use inline assembler or compiler intrinsics.
Ok, I have performed a direct translation (with all the preprocessor stuff replaced by string mixins). However, I think I could do a lot better starting from scratch in D. I have performed some basic testing with all the configuration options, and it seems to work correctly.

// File: memcpy.d  direct translation of memcpy.c

/********************************************************************
** File: memcpy.c
**
** Copyright (C) 1999-2010 Daniel Vik
**
** This software is provided 'as-is', without any express or implied
** warranty. In no event will the authors be held liable for any
** damages arising from the use of this software.
** Permission is granted to anyone to use this software for any
** purpose, including commercial applications, and to alter it and
** redistribute it freely, subject to the following restrictions:
**
** 1. The origin of this software must not be misrepresented; you
**    must not claim that you wrote the original software. If you
**    use this software in a product, an acknowledgment in the
**    product documentation would be appreciated but is not
**    required.
**
** 2. Altered source versions must be plainly marked as such, and
**    must not be misrepresented as being the original software.
**
** 3. This notice may not be removed or altered from any source
**    distribution.
**
**
** Description: Implementation of the standard library function memcpy.
**              This implementation of memcpy() is ANSI-C89 compatible.
**
**              The following configuration options can be set:
**
**              LITTLE_ENDIAN - Uses processor with little endian
**                              addressing. Default is big endian.
**
**              PRE_INC_PTRS  - Use pre increment of pointers.
**                              Default is post increment of
**                              pointers.
**
**              INDEXED_COPY  - Copying data using array indexing.
**                              Using this option, disables the
**                              PRE_INC_PTRS option.
**
**              MEMCPY_64BIT  - Compiles memcpy for 64 bit
**                              architectures
**
**
** Best Settings:
**
**              Intel x86:  LITTLE_ENDIAN and INDEXED_COPY
**
*******************************************************************/

/********************************************************************
** Configuration definitions.
*******************************************************************/

version = LITTLE_ENDIAN;
version = INDEXED_COPY;

/********************************************************************
** Includes for size_t definition
*******************************************************************/

/********************************************************************
** Typedefs
*******************************************************************/

version(MEMCPY_64BIT) version(D_LP32) static assert(0, "not a 64 bit compile");

version(D_LP64){
    alias ulong UIntN;
    enum TYPE_WIDTH = 8;
}else{
    alias uint UIntN;
    enum TYPE_WIDTH = 4;
}

/********************************************************************
** Remove definitions when INDEXED_COPY is defined.
*******************************************************************/

version(INDEXED_COPY){
    version(PRE_INC_PTRS) static assert(0, "cannot use INDEXED_COPY together with PRE_INC_PTRS!");
}

/********************************************************************
** The X template
*******************************************************************/

string Ximpl(string x){
    import utf = std.utf;
    string r=`"`;
    for(typeof(x.length) i=0;i<x.length;r~=x[i..i+utf.stride(x,i)],i+=utf.stride(x,i)){
        if(x[i]==' '&&x[i+1]=='('){
            auto start = ++i;
            int nest=1;
            while(nest){
                i+=utf.stride(x,i);
                if(x[i]=='(') nest++;
                else if(x[i]==')') nest--;
            }
            i++;
            r~=`"~`~x[start..i]~`~"`;
            if(i==x.length) break;
        }
        if(x[i]=='"'||x[i]=='\\'){r~="\\"; continue;}
    }
    return r~`"`;
}

template X(string x){ enum X = Ximpl(x); }

/********************************************************************
** Definitions for pre and post increment of pointers.
*******************************************************************/

// uses *(*&x)++ and similar to work around a bug in the parser
version(PRE_INC_PTRS){
    string START_VAL(string x) {return mixin(X!q{(*& (x))--;});}
    string INC_VAL(string x)   {return mixin(X!q{*++(*& (x))});}
    string CAST_TO_U8(string p, string o){
        return mixin(X!q{(cast(ubyte*) (p) + (o) + TYPE_WIDTH)});
    }
    enum WHILE_DEST_BREAK  = (TYPE_WIDTH - 1);
    enum PRE_LOOP_ADJUST   = q{- (TYPE_WIDTH - 1)};
    enum PRE_SWITCH_ADJUST = q{+ 1};
}else{
    string START_VAL(string x) {return q{};}
    string INC_VAL(string x)   {return mixin(X!q{*(*& (x))++});}
    string CAST_TO_U8(string p, string o){
        return mixin(X!q{(cast(ubyte*) (p) + (o))});
    }
    enum WHILE_DEST_BREAK  = 0;
    enum PRE_LOOP_ADJUST   = q{};
    enum PRE_SWITCH_ADJUST = q{};
}

/********************************************************************
** Definitions for endians
*******************************************************************/

version(LITTLE_ENDIAN){
    enum SHL = q{>>};
    enum SHR = q{<<};
}else{
    enum SHL = q{<<};
    enum SHR = q{>>};
}

/********************************************************************
** Macros for copying words of different alignment.
** Uses incremening pointers.
*******************************************************************/

string CP_INCR() {
    return mixin(X!q{
        (INC_VAL(q{dstN})) = (INC_VAL(q{srcN}));
    });
}

string CP_INCR_SH(string shl, string shr) {
    return mixin(X!q{
        dstWord  = srcWord (SHL) (shl);
        srcWord  = (INC_VAL(q{srcN}));
        dstWord |= srcWord (SHR) (shr);
        (INC_VAL(q{dstN})) = dstWord;
    });
}

/********************************************************************
** Macros for copying words of different alignment.
** Uses array indexes.
*******************************************************************/

string CP_INDEX(string idx) {
    return mixin(X!q{
        dstN[ (idx)] = srcN[ (idx)];
    });
}

string CP_INDEX_SH(string x, string shl, string shr) {
    return mixin(X!q{
        dstWord  = srcWord (SHL) (shl);
        srcWord  = srcN[ (x)];
        dstWord |= srcWord (SHR) (shr);
        dstN[ (x)] = dstWord;
    });
}

/********************************************************************
** Macros for copying words of different alignment.
** Uses incremening pointers or array indexes depending on
** configuration.
*******************************************************************/

version(INDEXED_COPY){
    alias CP_INDEX CP;
    alias CP_INDEX_SH CP_SH;
    string INC_INDEX(string p, string o){
        return mixin(X!q{
            (( (p)) += ( (o)));
        });
    }
}else{
    string CP(string idx) {return mixin(X!q{ (CP_INCR())});}
    string CP_SH(string idx, string shl, string shr){
        return mixin(X!q{
            (CP_INCR_SH(mixin(X!q{ (shl)}), mixin(X!q{ (shr)})));
        });
    }
    string INC_INDEX(string p, string o){return q{};}
}

string COPY_REMAINING(string count) {
    return mixin(X!q{
        (START_VAL(q{dst8}));
        (START_VAL(q{src8}));
        switch ( (count)) {
        case 7: (INC_VAL(q{dst8})) = (INC_VAL(q{src8}));
        case 6: (INC_VAL(q{dst8})) = (INC_VAL(q{src8}));
        case 5: (INC_VAL(q{dst8})) = (INC_VAL(q{src8}));
        case 4: (INC_VAL(q{dst8})) = (INC_VAL(q{src8}));
        case 3: (INC_VAL(q{dst8})) = (INC_VAL(q{src8}));
        case 2: (INC_VAL(q{dst8})) = (INC_VAL(q{src8}));
        case 1: (INC_VAL(q{dst8})) = (INC_VAL(q{src8}));
        case 0:
        default: break;
        }
    });
}

string COPY_NO_SHIFT() {
    return mixin(X!q{
        UIntN* dstN = cast(UIntN*)(dst8 (PRE_LOOP_ADJUST));
        UIntN* srcN = cast(UIntN*)(src8 (PRE_LOOP_ADJUST));
        size_t length = count / TYPE_WIDTH;

        while (length & 7) {
            (CP_INCR());
            length--;
        }

        length /= 8;

        while (length--) {
            (CP(q{0}));
            (CP(q{1}));
            (CP(q{2}));
            (CP(q{3}));
            (CP(q{4}));
            (CP(q{5}));
            (CP(q{6}));
            (CP(q{7}));
            (INC_INDEX(q{dstN}, q{8}));
            (INC_INDEX(q{srcN}, q{8}));
        }

        src8 = (CAST_TO_U8(q{srcN}, q{0}));
        dst8 = (CAST_TO_U8(q{dstN}, q{0}));

        (COPY_REMAINING(q{count & (TYPE_WIDTH - 1)}));

        return dest;
    });
}

string COPY_SHIFT(string shift) {
    return mixin(X!q{
        UIntN* dstN = cast(UIntN*)(((cast(UIntN)dst8) (PRE_LOOP_ADJUST)) & ~(TYPE_WIDTH - 1));
        UIntN* srcN = cast(UIntN*)(((cast(UIntN)src8) (PRE_LOOP_ADJUST)) & ~(TYPE_WIDTH - 1));
        size_t length = count / TYPE_WIDTH;
        UIntN srcWord = (INC_VAL(q{srcN}));
        UIntN dstWord;

        while (length & 7) {
            (CP_INCR_SH(mixin(X!q{8 * (shift)}), mixin(X!q{8 * (TYPE_WIDTH - (shift))})));
            length--;
        }

        length /= 8;

        while (length--) {
            (CP_SH(q{0}, mixin(X!q{8 * (shift)}), mixin(X!q{8 * (TYPE_WIDTH - (shift))})));
            (CP_SH(q{1}, mixin(X!q{8 * (shift)}), mixin(X!q{8 * (TYPE_WIDTH - (shift))})));
            (CP_SH(q{2}, mixin(X!q{8 * (shift)}), mixin(X!q{8 * (TYPE_WIDTH - (shift))})));
            (CP_SH(q{3}, mixin(X!q{8 * (shift)}), mixin(X!q{8 * (TYPE_WIDTH - (shift))})));
            (CP_SH(q{4}, mixin(X!q{8 * (shift)}), mixin(X!q{8 * (TYPE_WIDTH - (shift))})));
            (CP_SH(q{5}, mixin(X!q{8 * (shift)}), mixin(X!q{8 * (TYPE_WIDTH - (shift))})));
            (CP_SH(q{6}, mixin(X!q{8 * (shift)}), mixin(X!q{8 * (TYPE_WIDTH - (shift))})));
            (CP_SH(q{7}, mixin(X!q{8 * (shift)}), mixin(X!q{8 * (TYPE_WIDTH - (shift))})));
            (INC_INDEX(q{dstN}, q{8}));
            (INC_INDEX(q{srcN}, q{8}));
        }

        src8 = (CAST_TO_U8(q{srcN}, mixin(X!q{( (shift) - TYPE_WIDTH)})));
        dst8 = (CAST_TO_U8(q{dstN}, q{0}));

        (COPY_REMAINING(q{count & (TYPE_WIDTH - 1)}));

        return dest;
    });
}

/********************************************************************
**
** void *memcpy(void *dest, const void *src, size_t count)
**
** Args:     dest   - pointer to destination buffer
**           src    - pointer to source buffer
**           count  - number of bytes to copy
**
** Return:   A pointer to destination buffer
**
** Purpose:  Copies count bytes from src to dest.
**           No overlap check is performed.
**
*******************************************************************/

void *memcpy(void *dest, const void *src, size_t count)
{
    ubyte* dst8 = cast(ubyte*)dest;
    ubyte* src8 = cast(ubyte*)src;

    if (count < 8) {
        mixin(COPY_REMAINING(q{count}));
        return dest;
    }

    mixin(START_VAL(q{dst8}));
    mixin(START_VAL(q{src8}));

    while ((cast(UIntN)dst8 & (TYPE_WIDTH - 1)) != WHILE_DEST_BREAK) {
        mixin(INC_VAL(q{dst8})) = mixin(INC_VAL(q{src8}));
        count--;
    }

    switch ((mixin(`(cast(UIntN)src8)`~ PRE_SWITCH_ADJUST)) & (TYPE_WIDTH - 1)) {
        // { } required to work around DMD bug
        case 0: {mixin(COPY_NO_SHIFT());} break;
        case 1: {mixin(COPY_SHIFT(q{1}));} break;
        case 2: {mixin(COPY_SHIFT(q{2}));} break;
        case 3: {mixin(COPY_SHIFT(q{3}));} break;
        static if(TYPE_WIDTH > 4){ // was TYPE_WIDTH >= 4. bug in original code.
        case 4: {mixin(COPY_SHIFT(q{4}));} break;
        case 5: {mixin(COPY_SHIFT(q{5}));} break;
        case 6: {mixin(COPY_SHIFT(q{6}));} break;
        case 7: {mixin(COPY_SHIFT(q{7}));} break;
        }
        default: assert(0);
    }
}

void main(){
    int[13] x = [1,2,3,4,5,6,7,8,9,0,1,2,3];
    int[13] y;
    memcpy(y.ptr, x.ptr, x.sizeof);
    import std.stdio;
    writeln(y);
}
Dec 29 2011
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 29 December 2011 at 23:47:08 UTC, Timon Gehr wrote:
 ** The X template
Good work, but I'm not sure if inventing a DSL to make up for the problems in D string mixins that C macros don't have qualifies as "doing it right".
Dec 29 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/30/2011 06:58 AM, Vladimir Panteleev wrote:
 On Thursday, 29 December 2011 at 23:47:08 UTC, Timon Gehr wrote:
 ** The X template
Good work, but I'm not sure if inventing a DSL to make up for the problems in D string mixins that C macros don't have qualifies as "doing it right".
It certainly does. That is how all my code generation looks like. The fact that I am using string mixins to solve some problems shows that those are not 'problems in D string mixins'.
Dec 30 2011
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Friday, 30 December 2011 at 12:05:27 UTC, Timon Gehr wrote:
 On 12/30/2011 06:58 AM, Vladimir Panteleev wrote:
 On Thursday, 29 December 2011 at 23:47:08 UTC, Timon Gehr 
 wrote:
 ** The X template
Good work, but I'm not sure if inventing a DSL to make up for the problems in D string mixins that C macros don't have qualifies as "doing it right".
It certainly does. That is how all my code generation looks like. The fact that I am using string mixins to solve some problems shows that those are not 'problems in D string mixins'.
Never mind. You're right. I hadn't thought of this before (using DSL nesting to avoid breaking token nesting); it's a nice idea. I think I'll steal this for my code :)
Dec 30 2011
parent so <so so.so> writes:
On Fri, 30 Dec 2011 16:11:54 +0200, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Friday, 30 December 2011 at 12:05:27 UTC, Timon Gehr wrote:
 On 12/30/2011 06:58 AM, Vladimir Panteleev wrote:
 On Thursday, 29 December 2011 at 23:47:08 UTC, Timon Gehr wrote:
 ** The X template
Good work, but I'm not sure if inventing a DSL to make up for the problems in D string mixins that C macros don't have qualifies as "doing it right".
It certainly does. That is how all my code generation looks like. The fact that I am using string mixins to solve some problems shows that those are not 'problems in D string mixins'.
Never mind. You're right. I hadn't thought of this before (using DSL nesting to avoid breaking token nesting); it's a nice idea. I think I'll steal this for my code :)
For me, mixin sounds much more intuitive than inline for what we are trying to achieve with force-inline. If it was user friendly, now that would be awesome.
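
(A rough sketch of that intuition in today's D - the names are mine, and the ergonomics are exactly the problem:)

enum bumpCounter = q{ counter += step; }; // the would-be "inlined" body, as a token string

void main()
{
    int counter = 0, step = 2;
    mixin(bumpCounter); // always expanded in place: no call, no inlining heuristics involved
    mixin(bumpCounter);
    assert(counter == 4);
}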
Dec 30 2011
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 4:05 AM, Timon Gehr wrote:
 It certainly does. That is how all my code generation looks like. The fact that
 I am using string mixins to solve some problems shows that those are not
 'problems in D string mixins'.
I think your solution to parameterized strings is very nice. Can you write a brief article about it? This should be more widely known.
Dec 30 2011
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/30/11 12:10 PM, Walter Bright wrote:
 On 12/30/2011 4:05 AM, Timon Gehr wrote:
 It certainly does. That is how all my code generation looks like. The
 fact that
 I am using string mixins to solve some problems shows that those are not
 'problems in D string mixins'.
I think your solution to parameterized strings is very nice. Can you write a brief article about it? This should be more widely known.
The idea is good, but nonhygienic: the macro's expansion picks up symbols from the expansion context. Timon, to move from good to great, you may want to add parameters to the expansion process such that you replace the argument values during expansion. Andrei
Dec 30 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/30/2011 09:51 PM, Andrei Alexandrescu wrote:
 On 12/30/11 12:10 PM, Walter Bright wrote:
 On 12/30/2011 4:05 AM, Timon Gehr wrote:
 It certainly does. That is how all my code generation looks like. The
 fact that
 I am using string mixins to solve some problems shows that those are not
 'problems in D string mixins'.
I think your solution to parameterized strings is very nice. Can you write a brief article about it? This should be more widely known.
The idea is good, but nonhygienic: the macro's expansion picks up symbols from the expansion context.
What the template 'X' currently achieves is an improvement in syntax:

string generated = "foo!\""~x~"\"(\""~bar(y)~"\")";

vs

string generated = mixin(X!q{ foo!" (x)"(" (bar(y))") });

i.e. it is assumed that the generated code that results in a string expression will be mixed in right away. Kenji Hara's string mixin template proposal could be pulled to be able to enforce this at the same time as improving the syntax further:

mixin template X(string s){enum X = XImpl(s);}

string generated = X!q{ foo!" (x)"(" (bar(y))") }
 Timon, to move from good to great, you may want to add parameters to the
 expansion process such that you replace the argument values during
 expansion.
I like this:

string QUX(string param1, string op, string param2){
    return mixin(X!q{ ("__"~param1) (op) (param2~"__"); });
}

a lot more than this:

string QUX(string param1, string op, string param2){
    return mixin(X!(q{ 1 2 3 },"__"~param1, op, param2~"__"));
}

In an ideal world, I think the macro could be defined like this (using the new anonymous function syntax on a named function):

string QUX(string param1, string op, string param2) => X!q{ ("__"~param1) (op) (param2~"__"); };

and expanded like this:

mixin(QUX("foo","+","bar"));

I think what you have in mind is that macros are defined similar to this, and then would be expanded like this:

mixin(X!(QUX, "foo", "+", "bar"));

Is this better? I think it makes it more difficult to write and use such a macro, because there are no parameter names to document what the parameters are for.
Dec 30 2011
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/30/11 3:51 PM, Timon Gehr wrote:
 On 12/30/2011 09:51 PM, Andrei Alexandrescu wrote:
 On 12/30/11 12:10 PM, Walter Bright wrote:
 On 12/30/2011 4:05 AM, Timon Gehr wrote:
 It certainly does. That is how all my code generation looks like. The
 fact that
 I am using string mixins to solve some problems shows that those are
 not
 'problems in D string mixins'.
I think your solution to parameterized strings is very nice. Can you write a brief article about it? This should be more widely known.
The idea is good, but nonhygienic: the macro's expansion picks up symbols from the expansion context.
What the template 'X' currently achieves is an improvement in syntax: string generated = "foo!\""~x~"\"(\""~bar(y)~"\")"; vs string generated = mixin(X!q{ foo!" (x)"(" (bar(y))") });
I understand that. But the whole system must be redesigned. Quoting from my email (please let's continue here so as to avoid duplication):

The macro facility should be very simple: any compile-time string can be a macro "body". The key is the expansion facility, which replaces parameter placeholders (e.g. in the simplest instance $1, $2 etc) with actual parameters. This is missing. Also, there must be expansion of other already-defined macro names. This is already present.

The library has a simple interface:

enum myMacro = q{... $1 $2 $(anotherMacro($1))... };

// To mixin
mixin(expand(myMacro, "argument one", "argument two"));


Andrei
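
(A minimal CTFE sketch of such an expand, as one possible reading of the above - only plain $1..$9 substitution, expansion of nested macro names left out:)

string expand(string macroBody, string[] args...)
{
    string r;
    for (size_t i = 0; i < macroBody.length; i++)
    {
        // replace $1..$9 with the corresponding argument
        if (macroBody[i] == '$' && i + 1 < macroBody.length
            && macroBody[i + 1] >= '1' && macroBody[i + 1] <= '9')
        {
            r ~= args[macroBody[i + 1] - '1'];
            i++;
        }
        else
            r ~= macroBody[i];
    }
    return r;
}

enum myMacro = q{ auto tmp = $1 + $2; };

void main()
{
    int a = 1, b = 2;
    mixin(expand(myMacro, "a", "b")); // becomes: auto tmp = a + b;
    assert(tmp == 3);
}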
Dec 30 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/31/2011 12:02 AM, Andrei Alexandrescu wrote:
 On 12/30/11 3:51 PM, Timon Gehr wrote:
 On 12/30/2011 09:51 PM, Andrei Alexandrescu wrote:
 On 12/30/11 12:10 PM, Walter Bright wrote:
 On 12/30/2011 4:05 AM, Timon Gehr wrote:
 It certainly does. That is how all my code generation looks like. The
 fact that
 I am using string mixins to solve some problems shows that those are
 not
 'problems in D string mixins'.
I think your solution to parameterized strings is very nice. Can you write a brief article about it? This should be more widely known.
The idea is good, but nonhygienic: the macro's expansion picks up symbols from the expansion context.
What the template 'X' currently achieves is an improvement in syntax: string generated = "foo!\""~x~"\"(\""~bar(y)~"\")"; vs string generated = mixin(X!q{ foo!" (x)"(" (bar(y))") });
I understand that. But the whole system must be redesigned. Quoting from my email (please let's continue here so as to avoid duplication):

The macro facility should be very simple: any compile-time string can be a macro "body". The key is the expansion facility, which replaces parameter placeholders (e.g. in the simplest instance $1, $2 etc) with actual parameters. This is missing. Also, there must be expansion of other already-defined macro names. This is already present.

The library has a simple interface:

enum myMacro = q{... $1 $2 $(anotherMacro($1))... };

// To mixin
mixin(expand(myMacro, "argument one", "argument two"));


Andrei
I understand, but compared to how I solved the issue:

1. it invents an (arguably inferior) parameter passing system, even though there is one in the language.

2. it picks up all symbols used in $(...) from the caller's context rather than the callee's context and there is no way to get rid of that default, because the macro is unscoped.
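
(A single-file illustration of point 2 - the string merely mentions 'helper'; everything it names has to exist at the expansion site:)

enum callHelper = q{ helper(1) }; // defines nothing, only mentions 'helper'

int helper(int x) { return 10 * x; } // found at the mixin site below

void main()
{
    int r = mixin(callHelper); // resolved in *this* scope, not where the string came from
    assert(r == 10);
}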
Dec 30 2011
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 3:21 PM, Timon Gehr wrote:
 2. it picks up all symbols used in $(...) from the caller's context rather than
 the callee's context and there is no way to get rid of that default, because
the
 macro is unscoped.
That's characteristic of how macros work, and people want it that way. Otherwise, they'd use functions or templates.
Dec 30 2011
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 12/31/2011 12:34 AM, Walter Bright wrote:
 On 12/30/2011 3:21 PM, Timon Gehr wrote:
 2. it picks up all symbols used in $(...) from the caller's context
 rather than
 the callee's context and there is no way to get rid of that default,
 because the
 macro is unscoped.
That's characteristic of how macros work, and people want it that way. Otherwise, they'd use functions or templates.
I don't think that is true. It's an undesirable characteristic. I don't think it works well together with a module system. Note that I use arbitrary CTFE inside (...). That can be correcting the case of an identifier or any implementation detail. I don't want to require any module that expands the macro to import those implementation details publicly. If I actually want to pick up identifiers from the caller's scope, that is easy: I just embed another X template instantiation.
Dec 30 2011
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/30/11 5:21 PM, Timon Gehr wrote:
 On 12/31/2011 12:02 AM, Andrei Alexandrescu wrote:
 The library has a simple interface:

 enum myMacro = q{... $1 $2 $(anotherMacro($1))... };

 // To mixin
 mixin(expand(myMacro, "argument one", "argument two"));


 Andrei
I understand, but compared to how I solved the issue:

1. it invents an (arguably inferior) parameter passing system, even though there is one in the language.

2. it picks up all symbols used in $(...) from the caller's context rather than the callee's context and there is no way to get rid of that default, because the macro is unscoped.
Fair enough. I think your idea of defining a mini-macro-expansion system based on CTFE and strings is genius. I also think at the present the semantics are very unprincipled, and should not be popularized in any form lest D mixins acquire reputation similar to C macros.

Finally, I think you have the resources to work your idea into a wonderful system that will be principled, practical, and extremely powerful.

Good luck!

Andrei
Dec 30 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/31/2011 01:10 AM, Andrei Alexandrescu wrote:
 On 12/30/11 5:21 PM, Timon Gehr wrote:
 On 12/31/2011 12:02 AM, Andrei Alexandrescu wrote:
 The library has a simple interface:

 enum myMacro = q{... $1 $2 $(anotherMacro($1))... };

 // To mixin
 mixin(expand(myMacro, "argument one", "argument two"));


 Andrei
I understand, but compared to how I solved the issue:

1. it invents an (arguably inferior) parameter passing system, even though there is one in the language.

2. it picks up all symbols used in $(...) from the caller's context rather than the callee's context and there is no way to get rid of that default, because the macro is unscoped.
Fair enough. I think your idea of defining a mini-macro-expansion system based on CTFE and strings is genius. I also think at the present the semantics are very unprincipled, and should not be popularized in any form lest D mixins acquire reputation similar to C macros.
I think what you propose is a lot closer to C macros than what I already use. Therefore I don't understand what qualifies its semantics as unprincipled.
 Finally, I think you have the resources to work your idea into a wonderful
system
 that will be principled, practical, and extremely powerful.

 Good luck!

 Andrei
I'd be happy to extend the system, but currently I don't see it fall short of any of the three requirements. Can you help me out?
Dec 30 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/30/11 6:25 PM, Timon Gehr wrote:
 I'd be happy to extend the system, but currently I don't see it fall
 short any of the three requirements. Can you help me out?
I think it would be great to reproduce the expansion semantics of ddoc. Andrei
Dec 30 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/31/2011 03:16 AM, Andrei Alexandrescu wrote:
 On 12/30/11 6:25 PM, Timon Gehr wrote:
 I'd be happy to extend the system, but currently I don't see it fall
 short any of the three requirements. Can you help me out?
I think it would be great to reproduce the expansion semantics of ddoc. Andrei
So basically, just breaking infinite recursion on recursive identical instantiations? In what way does such a feature improve the expressiveness of the macro system?
Dec 30 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 6:59 PM, Timon Gehr wrote:
 On 12/31/2011 03:16 AM, Andrei Alexandrescu wrote:
 On 12/30/11 6:25 PM, Timon Gehr wrote:
 I'd be happy to extend the system, but currently I don't see it fall
 short any of the three requirements. Can you help me out?
I think it would be great to reproduce the expansion semantics of ddoc. Andrei
So basically, just breaking infinite recursion on recursive identical instantiations? In what way does such a feature improve the expressiveness of the macro system?
Because inevitably someone will write:

#define FOO a + FOO

and expect it to work (the correct expansion would be "a + FOO", not a stack overflow). The C preprocessor works this way, as do makefile macros, as Ddoc does.
Dec 30 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/31/2011 04:50 AM, Walter Bright wrote:
 On 12/30/2011 6:59 PM, Timon Gehr wrote:
 On 12/31/2011 03:16 AM, Andrei Alexandrescu wrote:
 On 12/30/11 6:25 PM, Timon Gehr wrote:
 I'd be happy to extend the system, but currently I don't see it fall
 short any of the three requirements. Can you help me out?
I think it would be great to reproduce the expansion semantics of ddoc. Andrei
So basically, just breaking infinite recursion on recursive identical instantiations? In what way does such a feature improve the expressiveness of the macro system?
Because inevitably someone will write:

#define FOO a + FOO

and expect it to work (the correct expansion would be "a + FOO", not a stack overflow). The C preprocessor works this way, as do makefile macros, as Ddoc does.
Makes sense, but why is it an issue if expansion is explicit?

enum FOO = q{a + FOO};
mixin(FOO);
Dec 30 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 8:02 PM, Timon Gehr wrote:
 On 12/31/2011 04:50 AM, Walter Bright wrote:
 Because inevitably someone will write:

 #define FOO a + FOO

 and expect it to work (the correct expansion would be "a + FOO", not a
 stack overflow). The C preprocessor works this way, as do makefile
 macros, as Ddoc does.
Makes sense, but why is it an issue if expansion is explicit? enum FOO = q{a + FOO}; mixin(FOO);
Because the expanded text is then rescanned for further macro replacement.
Dec 30 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/31/2011 05:19 AM, Walter Bright wrote:
 On 12/30/2011 8:02 PM, Timon Gehr wrote:
 On 12/31/2011 04:50 AM, Walter Bright wrote:
 Because inevitably someone will write:

 #define FOO a + FOO

 and expect it to work (the correct expansion would be "a + FOO", not a
 stack overflow). The C preprocessor works this way, as do makefile
 macros, as Ddoc does.
Makes sense, but why is it an issue if expansion is explicit? enum FOO = q{a + FOO}; mixin(FOO);
Because the expanded text is then rescanned for further macro replacement.
Yes, but in q{a + FOO} there is none.
Dec 30 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 8:34 PM, Timon Gehr wrote:
 On 12/31/2011 05:19 AM, Walter Bright wrote:
 On 12/30/2011 8:02 PM, Timon Gehr wrote:
 On 12/31/2011 04:50 AM, Walter Bright wrote:
 Because inevitably someone will write:

 #define FOO a + FOO

 and expect it to work (the correct expansion would be "a + FOO", not a
 stack overflow). The C preprocessor works this way, as do makefile
 macros, as Ddoc does.
Makes sense, but why is it an issue if expansion is explicit? enum FOO = q{a + FOO}; mixin(FOO);
Because the expanded text is then rescanned for further macro replacement.
Yes, but in q{a + FOO} there is none.
#define FOO a+Foo

FOO;

What is the text after macro expansion?
Dec 30 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 10:50 PM, Walter Bright wrote:
 On 12/30/2011 8:34 PM, Timon Gehr wrote:
 On 12/31/2011 05:19 AM, Walter Bright wrote:
 On 12/30/2011 8:02 PM, Timon Gehr wrote:
 On 12/31/2011 04:50 AM, Walter Bright wrote:
 Because inevitably someone will write:

 #define FOO a + FOO

 and expect it to work (the correct expansion would be "a + FOO", not a
 stack overflow). The C preprocessor works this way, as do makefile
 macros, as Ddoc does.
Makes sense, but why is it an issue if expansion is explicit? enum FOO = q{a + FOO}; mixin(FOO);
Because the expanded text is then rescanned for further macro replacement.
Yes, but in q{a + FOO} there is none.
#define FOO a+Foo FOO; What is the text after macro expansion?
Blast, I meant

#define FOO a+FOO

FOO;
Dec 30 2011
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 12/31/2011 07:50 AM, Walter Bright wrote:
 On 12/30/2011 10:50 PM, Walter Bright wrote:
 On 12/30/2011 8:34 PM, Timon Gehr wrote:
 On 12/31/2011 05:19 AM, Walter Bright wrote:
 On 12/30/2011 8:02 PM, Timon Gehr wrote:
 On 12/31/2011 04:50 AM, Walter Bright wrote:
 Because inevitably someone will write:

 #define FOO a + FOO

 and expect it to work (the correct expansion would be "a + FOO",
 not a
 stack overflow). The C preprocessor works this way, as do makefile
 macros, as Ddoc does.
Makes sense, but why is it an issue if expansion is explicit? enum FOO = q{a + FOO}; mixin(FOO);
Because the expanded text is then rescanned for further macro replacement.
Yes, but in q{a + FOO} there is none.
#define FOO a+Foo FOO; What is the text after macro expansion?
Blast, I meant #define FOO a+FOO FOO;
FOO; -> a + FOO;

mixin(FOO~";"); -> a + FOO;

The two do the same thing.
Dec 31 2011
prev sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Timon Gehr" <timon.gehr gmx.ch> wrote in message 
news:jdlbpq$2b7e$1 digitalmars.com...
 What the template 'X' currently achieves is an improvement in syntax:

 string generated = "foo!\""~x~"\"(\""~bar(y)~"\")";
Ewww, who in the world uses double-quote strings for code containing quotes? That's not a fair comparison. This is a better comparison:

string generated = `foo!"`~x~`"("`~bar(y)~`")`;

vs

string generated = mixin(X!q{ foo!" (x)"(" (bar(y))") });
Jan 02 2012
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/02/2012 11:07 PM, Nick Sabalausky wrote:
 "Timon Gehr"<timon.gehr gmx.ch>  wrote in message
 news:jdlbpq$2b7e$1 digitalmars.com...
 What the template 'X' currently achieves is an improvement in syntax:

 string generated = "foo!\""~x~"\"(\""~bar(y)~"\")";
Ewww, who in the world uses double-quote strings for code containing quotes? That's not a fair comparison. This is a better comparison:

string generated = `foo!"`~x~`"("`~bar(y)~`")`;

vs

string generated = mixin(X!q{ foo!" (x)"(" (bar(y))") });
What if the code contains both " and `? Using `` strings for code that contains quotes is not a general solution.
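
(A tiny self-contained case with both kinds of quotes in the generated code; a token string carries it with no escaping at all:)

import std.stdio;

void main()
{
    enum code = q{ writeln(`back`, "tick\n"); }; // both ` and " inside, unescaped
    mixin(code);
}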
Jan 02 2012
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, January 02, 2012 23:19:19 Timon Gehr wrote:
 What if the code contains both " and `? Using `` strings for code that
 contains quotes is not a general solution.
True. But it's a solution that works most of the time. - Jonathan M Davis
Jan 02 2012
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 12/30/2011 07:10 PM, Walter Bright wrote:
 On 12/30/2011 4:05 AM, Timon Gehr wrote:
 It certainly does. That is how all my code generation looks like. The
 fact that
 I am using string mixins to solve some problems shows that those are not
 'problems in D string mixins'.
I think your solution to parameterized strings is very nice. Can you write a brief article about it? This should be more widely known.
Ok.
Dec 30 2011
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
This conversation has meandered into one very specific branch, but I just
want to add my 2c to the OP.
I agree, I want D to be a useful systems language too. These are my issues
to that end:

 * __forceinline ... I wasn't aware this didn't exist... and yes, despite
all this discussion, I still depend on this all the time. People are
talking about implementing forceinline by imitating macros using mixins...
crazy? Here's a solid reason I avoid mixins or procedurally generated code
(and the preprocessor in C for that matter, in favour of __forceinline):
YOU CAN DEBUG IT. In an inline function, the code exists in the source
file, just like any other function, you can STEP THE DEBUGGER through it,
and inspect the values easily. This is an underrated requirement. I would
waste hours on many days if I couldn't do this. I would only ever use
string mixins for the most obscure uses, preferring inline functions for
the sake of debugging 99% of the time.

 * vector type ... D has exactly no way to tell the compiler to allocate
128bit vector registers, load/store, and pass them to/from functions. That
is MOST of the register memory on virtually every modern processor, and D
can't address it... wtf?

 * inline assembler needs pseudo registers ... The inline assembler is
pretty crap, imitating C, which is outdated. Registers in assembly code
should barely ever be addressed directly, they should only be addressed by
TYPE, allowing the compiler to allocate available registers (and/or manage
storing the the stack where required) as with any other code. Inline
assembly without pseudo-registers is almost always an un-optimisation, and
this is also the reason why almost all C programmers use hardware opcode
intrinsics instead of inline assembly. There is no way without using
intrinsics in C to allow the compiler to perform optimal register
allocation, and this is still true for D, and in my opinion, just plain
broken.

 * __restrict ... I've said this before, but not being able to hint that
the compiler ignore possible pointer aliasing is a big performance problem,
especially when interacting with C libs.

 * multiple return values (in registers) ... (just because I opened a topic
about it before) This saves memory accesses in common cases where I want to
return (x, y), or (retVal, errorCode) for instance.
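
(The inline-assembler point above, sketched with DMD-style 32-bit inline asm - the algorithm is irrelevant, the hand-pinned register names are the problem:)

int rotl1(int x)
{
    asm
    {
        mov EAX, x; // EAX is pinned by hand...
        rol EAX, 1; // ...so the compiler cannot re-allocate registers
        mov x, EAX; //    or schedule freely around this block
    }
    return x;
}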

Walter made an argument "The same goes for all those language extensions
you mentioned. Those are not part of Standard C. They are vendor
extensions. Does that mean that C is not actually a systems language? No."
This is absurd... are you saying that you expect Iain to add these things
to GDC so that people can use them, and then create incompatible D code
with the 'standard' compiler?
Why would you intentionally fragment the compiler support of language
features rather than just making trivial (but important) features that
people do use part of the language?

This is a great example of why C is shit, and a good example of why I'm
interested in D at all...

On 29 December 2011 13:19, Vladimir Panteleev
<vladimir thecybershadow.net>wrote:

 On Thursday, 29 December 2011 at 09:16:23 UTC, Walter Bright wrote:

 Are you a ridiculous hacker? Inline x86 assembly that the compiler
 actually understands in 32 AND 64 bit code, hex string literals like x"DE
 ADB EEF" where spacing doesn't matter, the ability to set data alignment
 cross-platform with type.alignof = 16, load your shellcode verbatim into a
 string like so: auto str = import("shellcode.txt");
I would like to talk about this for a bit. Personally, I think D's system programming abilities are only half-way there. Note that I am not talking about use cases in high-level application code, but rather low-level, widely-used framework code, where every bit of performance matters (for example: memory copy routines, string builders, garbage collectors).

In-line assembler as part of the language is certainly neat, and in fact coming from Delphi to C++ I was surprised to learn that C++ implementations adopted different syntax for asm blocks. However, compared to some C++ compilers, it has severe limitations and is D's only trick in this alley.

For one thing, there is no way to force the compiler to inline a function (like __forceinline / __attribute((always_inline)) ). This is fine for high-level code (where users are best left with PGO and "the compiler knows best"), but sucks if you need a guarantee that the function must be inlined. The guarantee isn't just about inlining heuristics, but also implementation capabilities. For example, some implementations might not be able to inline functions that use certain language features, and your code's performance could demand that such a short function must be inlined. One example of this is inlining functions containing asm blocks - IIRC DMD does not support this. The compiler should fail the build if it can't inline a function tagged with forceinline, instead of shrugging it off and failing silently, forcing users to check the disassembly every time.

You may have noticed that GCC has some ridiculously complicated assembler facilities. However, they also open the way to the possibilities of writing optimal code - for example, creating custom calling conventions, or inlining assembler functions without restricting the caller's register allocation with a predetermined calling convention. In contrast, DMD is very conservative when it comes to mixing D and assembler. One time I found that putting an asm block in a function turned what were single instructions into blocks of 6 instructions each.

D's lacking in this area makes it impossible to create language features that are on the level of D's compiler built-ins. For example, I have tested three memcpy implementations recently, but none of them could beat DMD's standard array slice copy (despite that in release mode it compiles to a simple memcpy call). Why? Because the overhead of using a custom memcpy routine negated its performance gains. This might have been alleviated with the presence of sane macros, but no such luck. String mixins are not the answer: trying to translate macro-heavy C code to D using string mixins is string escape hell, and we're back to the level of shell scripts.

We've discussed this topic on IRC recently. From what I understood, Andrei thinks improvements in this area are not "impactful" enough, which I find worrisome. Personally, I don't think D qualifies as a true "system programming language" in light of the above. It's more of a compiled language with pointers and assembler. Before you disagree with any of the above, first (for starters) I'd like to invite you to translate Daniel Vik's C memcpy implementation to D: http://www.danielvik.com/2010/02/fast-memcpy-in-c.html . It doesn't even use inline assembler or compiler intrinsics.
Jan 04 2012
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Manu:

  * vector type ... D has exactly no way to tell the compiler to allocate
 128bit vector registers, load/store, and pass then to/from functions. That
 is MOST of the register memory on virtually every modern processor, and D
 can't address it... wtf?
Currently the built-in vector operations of D are not optimized, their syntax and semantics has some small holes that I'd like to see fixed (it's not just a matter of implementation bugs, I also mean design bugs). So I suggest first to improve them a lot, and only later, if necessary, to introduce intrinsics. Bye, bearophile
Jan 04 2012
parent reply Manu <turkeyman gmail.com> writes:
 Manu:

  * vector type ... D has exactly no way to tell the compiler to allocate
 128bit vector registers, load/store, and pass then to/from functions.
That
 is MOST of the register memory on virtually every modern processor, and D
 can't address it... wtf?
Currently the built-in vector operations of D are not optimized, their syntax and semantics has some small holes that I'd like to see fixed (it's not just a matter of implementation bugs, I also mean design bugs). So I suggest first to improve them a lot, and only later, if necessary, to introduce intrinsics.
I'm not referring to vector OPERATIONS. I only refer to the creation of a type to identify these registers... anything more than that can be done with inline asm, hardware intrinsics, etc, but the language MUST at least expose the type to allow register allocation and parameter passing. A language defined 128bit SIMD type would be fine for basically all architectures. Even though they support different operations on these registers, the size and allocation patterns are always the same across all architectures; 128 bits, 16byte aligned, etc. This allows at minimum platform independent expression of structures containing simd data, and calling of functions passing these types as args. SSE, VMX (PPC), VFP (ARM)... they all share the same rules.
Jan 04 2012
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Manu:

 I'm not referring to vector OPERATIONS. I only refer to the creation of a
 type to identify these registers...
Please, try to step back a bit and look at this problem from a bit more distance. D has vector operations, and so far they have received only a tiny amount of love. Are you able to find some ways to solve some of your problems using a hypothetical much better implementation of D vector operations? Please, think about the possibilities of this syntax. Think about future CPU evolution with SIMD registers 128, then 256, then 512, then 1024 bits long. In theory a good compiler is able to use them with no changes in the D code that uses vector operations. Intrinsics are an additive change, adding them later is possible. But I think fixing the syntax of vector ops is more important. I have some bug reports in Bugzilla about vector ops that are sleeping there since two years or so, and they are not about implementation performance. I think the good Hara will be able to implement those syntax fixes in a matter of just one day or very few days if a consensus is reached about what actually is to be fixed in D vector ops syntax. Instead of discussing about *adding* something (register intrinsics) I suggest to discuss about what to fix about the *already present* vector op syntax. This is not a request to just you Manu, but to this whole newsgroup. Bye, bearophile
Jan 04 2012
next sibling parent Peter Alexander <peter.alexander.au gmail.com> writes:
On 5/01/12 12:42 AM, bearophile wrote:
 Manu:

 I'm not referring to vector OPERATIONS. I only refer to the creation of a
 type to identify these registers...
Please, try to step back a bit and look at this problem from a bit more distance. D has vector operations, and so far they have received only a tiny amount of love. Are you able to find some ways to solve some of your problems using a hypothetical much better implementation of D vector operations? Please, think about the possibilities of this syntax. Think about future CPU evolution with SIMD registers 128, then 256, then 512, then 1024 bits long. In theory a good compiler is able to use them with no changes in the D code that uses vector operations. Intrinsics are an additive change, adding them later is possible. But I think fixing the syntax of vector ops is more important. I have some bug reports in Bugzilla about vector ops that are sleeping there since two years or so, and they are not about implementation performance. I think the good Hara will be able to implement those syntax fixes in a matter of just one day or very few days if a consensus is reached about what actually is to be fixed in D vector ops syntax. Instead of discussing about *adding* something (register intrinsics) I suggest to discuss about what to fix about the *already present* vector op syntax. This is not a request to just you Manu, but to this whole newsgroup. Bye, bearophile
D has no alignment support, so there is no way to specify that you want a float[4] to be aligned on 16-bytes, which means there is no way for the compiler to generate code to exploit SSE well. It has to be conservative and assume unaligned.

Suppose alignment support is added:

alias align(16) float[4] vec4f;

vec4f a, b;
...
a[0] = a[3];
a[1] = a[2];
a[2] = b[0];
a[3] = b[1];

Is it reasonable to expect compilers to generate a single shuffle instruction from this? What about more complicated code like computing a dot product. What D code do I write to get the compiler to generate the expected machine code?

If we get alignment support and lots of work goes into optimizing vector ops for this then we can go a long way without intrinsics, but I don't think we'll ever be able to completely remove the need for intrinsics.
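
(For the dot product case, the best one can write today is roughly this - whether any current compiler maps it onto SSE is exactly the open question:)

float dot(float[4] a, float[4] b)
{
    float[4] tmp;
    tmp[] = a[] * b[]; // element-wise vector op
    return tmp[0] + tmp[1] + tmp[2] + tmp[3]; // horizontal sum, by hand
}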
Jan 04 2012
prev sibling parent reply Manu <turkeyman gmail.com> writes:
On 5 January 2012 02:42, bearophile <bearophileHUGS lycos.com> wrote:

 Manu:

 I'm not referring to vector OPERATIONS. I only refer to the creation of a
 type to identify these registers...
Please, try to step back a bit and look at this problem from a bit more distance. D has vector operations, and so far they have received only a tiny amount of love. Are you able to find some ways to solve some of your problems using a hypothetical much better implementation of D vector operations? Please, think about the possibilities of this syntax. Think about future CPU evolution with SIMD registers 128, then 256, then 512, then 1024 bits long. In theory a good compiler is able to use them with no changes in the D code that uses vector operations.
These are all fundamentally different types, like int and long.. float and double... and I certainly want a keyword to identify each of them. Even if the compiler is trying to make auto vector optimisations, you can't deny programmers explicit control of the hardware when they want/need it.

Look at x86 compilers: they have been TRYING to perform automatic SSE optimisations for 10 years, with basically no success... do you really think you can do better than all that work by Microsoft and GCC? In my experience, I've even run into a lot of VC's auto-SSE-ed code that is SLOWER than the original float code. Let's not even mention architectures that receive much less love than x86, and are arguably more important (ARM; slower, simpler processors with more demand to perform well, and not waste power).

Also, D is NOT a good compiler, it's a rubbish compiler with respect to code generation. And with a community so small, it has no hope of becoming a 'good' compiler any time soon.. Even C/C++ compilers that have been around for decades used by millions have been promising optimisations that are still not available, and the ones that are come at the expense of decades of smart engineers on huge paycheques.
 Intrinsics are an additive change, adding them later is possible. But I
 think fixing the syntax of vector ops is more important. I have some bug
 reports in Bugzilla about vector ops that are sleeping there since two
 years or so, and they are not about implementation performance.
Vector ops and SIMD ops are different things. float[4] (or more realistically, float[3]) should NOT be a candidate for automatic SIMD implementation, likewise, simd_type should not have its components individually accessible. These are operations the hardware can not actually perform. So no syntax to worry about, just a type.
 I think the good Hara will be able to implement those syntax fixes in a
 matter of just one day or very few days if a consensus is reached about
 what actually is to be fixed in D vector ops syntax.
 Instead of discussing about *adding* something (register intrinsics) I
 suggest to discuss about what to fix about the *already present* vector op
 syntax. This is not a request to just you Manu, but to this whole newsgroup.
And I think this is exactly the wrong approach. A vector is NOT an array of 4 (actually, usually 3) floats. It should not appear as one. This is an overly complicated and ultimately wrong way to engage this hardware. Imagine the complexity in the compiler to try and force float[4] operations into vector arithmetic vs adding a 'v128' type which actually does what people want anyway...

SIMD units are not float units, they should not appear like an aggregation of float units. They have:
 * Different error semantics, exception handling rules, sometimes different precision...
 * Special alignment rules.
 * Special literal expression/assignment.
 * You can NOT access individual components at will.
 * May be reinterpreted at any time as float[1] float[4] double[2] short[8] char[16], etc... (up to the architecture intrinsics)
 * Can not be involved in conventional comparison logic (array of floats would make you think they could)
 *** Can NOT interact with the regular 'float' unit... Vectors as an array of floats certainly suggests that you can interact with scalar floats...

I will use architecture intrinsics to operate on these regs, and put that nice and neatly behind a hardware vector type with version()'s for each architecture, and an API with a whole lot of sugar to make them nice and friendly to use.

My argument is that even IF the compiler some day attempts to make vector optimisations to float[4] arrays, the raw hardware should be exposed first, and allow programmers to use it directly. This starts with a language defined (platform independent) v128 type.
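
(For contrast, what touching those registers looks like today - DMD-style x86 (32-bit) inline asm, function and variable names mine. The XMM registers are reachable, but there is no type that lets a value stay in one across ordinary D code:)

void add4f()
{
    float[4] a = [1, 2, 3, 4];
    float[4] b = [10, 20, 30, 40];
    float[4] r;
    auto pa = a.ptr, pb = b.ptr, pr = r.ptr;
    asm
    {
        mov EAX, pa;
        mov ECX, pb;
        mov EDX, pr;
        movups XMM0, [EAX]; // the vector exists in XMM0 only inside this block
        movups XMM1, [ECX];
        addps  XMM0, XMM1;
        movups [EDX], XMM0; // it has to go back through memory to reach D code
    }
}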
Jan 05 2012
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/05/2012 10:02 AM, Manu wrote:
...
 Also, D is NOT a good compiler, it's a rubbish compiler with respect to
 code generation.
 [snip.]
D is not a compiler, it is a language. Furthermore it is not true that DMD's backend is rubbish and there are already more backends than just the DMC backend.

DMD: DMC backend, _fast code gen_ and very pleasant to use for debug builds.

GDC: GCC backend, optimizes as well as GCC. This is what I use for release. Takes about three times longer for a debug build than DMD. A lot less usable for the edit-compile-test cycle than DMD.

LDC: LLVM backend, implements some additional optimizations in the front end. I don't have LDC installed, but iirc it is also a lot slower than DMD.

I think it would be nice if you stopped spreading FUD. You seem to have reasonable requests.
Jan 05 2012
next sibling parent reply Manu <turkeyman gmail.com> writes:
 D is not a compiler, it is a language. Furthermore it is not true that
 DMDs backend is rubbish and there are already more backends than just the
 DMC backend.
Sorry, I was generalising a little in that claim. And where I say 'rubbish', I was drawing a comparison to the maturity of C compilers for x86, which STILL have trouble and make lots of mistakes with centuries of man hours of work.

DMD has inferior code gen (there was a post last night comparing some disassemblies of trivial programs with GDC), will probably never rival GCC, that's fine, it's a reference, I get that. But to say that using GDC will magically fix code gen is also false. I'm not familiar with the GCC code, so I may be wrong, but my understanding is that there is frontend work, and frontend-GCC glue work that will allow for back end optimisation (which GCC can do quite well) to work properly. This is still a lot of work for a small OSS team.

I also wonder if the D language provides some opportunities for optimisation that aren't expressible in other languages, and therefore may not already have an expression in the GCC back end... so I can imagine some of the future optimisations frequently discussed in this forum won't just magically appear with GCC/LLVM maturity. I can't imagine Iain and co extending the GCC back end to support some obscure D optimisations happening any time soon.

The point I was making (which you seem to have missed thanks to my inflammatory comment ;), was that I don't have faith that compiler maturity will solve all these problems in prompt time, and even if they do for x86, what about for less common architectures that receive a lot less love (as is also the case in C)?

I make this argument in support of the language expressing optimal constructs with ease and by default, rather than expressing some concept that feels nice to programmers, but puts a burden on the whole-program-optimiser to fix. For example, virtual-by-default RELIES on whole-program-optimisation to fix, whereas final by default has no performance implications, and will produce the best code automatically.

 I think it would be nice if you stopped spreading FUD. You seem to have
 reasonable requests.
Perhaps a fair request, but I only do this because after a couple of months now, I have a good measure of FUD, and I receive very little response to anything I've raised that would make me feel otherwise. The most distressing thing to me is the pattern I see where most of my more trivial (but still significant) points are outright dismissed, and the hard ones are ignored, rather than reasonably argued and putting my FUD to rest. :)

I also accept that I produce very little evidence to support any of my claims, so I'm easy to ignore, but this is because all my work and experience is commercial, private, and I can't easily extract anything without wasting a lot of work time to present it... not to mention breaking NDA's. Most problem cases aren't trivial, require a large context to prove with benchmarks. I can't easily write a few lines and say "here you go".

At some level I'd like to think people would accept the word of a seasoned game engine dev who's genuinely interested in adopting the language for that sort of work, but I completely understand those who are skeptical. ;)
Jan 05 2012
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/5/12 10:49 AM, Manu wrote:
 I make this argument in support of the language expressing optimal
 constructs with ease and by default, rather than expressing some concept
 that feels nice to programmers, but puts a burden on the
 whole-program-optimiser to fix.
 For example, virtual-by-default RELIES on whole-program-optimisation to
 fix, whereas final by default has no performance implications, and will
 produce the best code automatically.
Your point is well meaning. I trust you understood and internalized your options outside a language change: using final or private in interfaces and classes, using struct instead of class, switching design to static polymorphism etc. Our assessment is that these work very well, promote good class hierarchy design, require reasonably little work from the programmer, and do not need advanced compiler optimizations. The D programming language is stabilizing. Making a change of such a magnitude is not negotiable, and moreover we believe the current design is very good in that regard so we are twice as motivated to keep it. At this point you need to evaluate whether you can live with this annoyance or forgo use of the language. Thanks, Andrei
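For illustration, a minimal sketch of what those options look like in code today (the class, struct, and method names here are invented):

    // final methods and structs avoid the virtual dispatch under discussion
    class Renderer
    {
        final void submit() { }   // explicitly non-virtual: direct call, inlinable
        void onResize() { }       // virtual by default, as D class methods are
    }

    final class Texture           // final class: nothing can override its methods
    {
        void bind() { }
    }

    struct Vec3                   // structs have no vtable at all
    {
        float x = 0, y = 0, z = 0;
        void scale(float s) { x *= s; y *= s; z *= s; }
    }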
Jan 05 2012
next sibling parent reply Manu <turkeyman gmail.com> writes:
On 5 January 2012 19:35, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org>wrote:

 On 1/5/12 10:49 AM, Manu wrote:

 I make this argument in support of the language expressing optimal
 constructs with ease and by default, rather than expressing some concept
 that feels nice to programmers, but puts a burden on the
 whole-program-optimiser to fix.
 For example, virtual-by-default RELIES on whole-program-optimisation to
 fix, whereas final by default has no performance implications, and will
 produce the best code automatically.
Your point is well meaning. I trust you understood and internalized your options outside a language change: using final or private in interfaces and classes, using struct instead of class, switching design to static polymorphism etc. Our assessment is that these work very well, promote good class hierarchy design, require reasonably little work from the programmer, and do not need advanced compiler optimizations. The D programming language is stabilizing. Making a change of such a magnitude is not negotiable, and moreover we believe the current design is very good in that regard so we are twice as motivated to keep it. At this point you need to evaluate whether you can live with this annoyance or forgo use of the language.
I do realise all the implementation details you suggest. My core point is that this is dangerous. A new and/or junior programmer is not likely to know all that... they will most probably type 'class', and then start typing methods. Why would they do anything else? Every other language I can think of trains them to do that. The D catch phrase 'the right thing is the easiest thing to do' doesn't seem to hold up here... although that does depend on your point of view on 'right', for which I've made my argument numerous times, so I'll desist from here on ;)

I also realise this issue is non-negotiable. I said that in a previous email, and I'm prepared to live with it (although I wonder if a compiler option would be possible?)... As said, I still felt it was important to raise this one for conversation's sake, and to make sure this point of view towards issues like this is more seriously considered in future.

That said, this is just one of numerous issues myself and the OP raised. I don't know why this one became the most popular for discussion... my suspicion is that it's because this is the easiest of my complaints to dismiss and shut down ;) This is also the least interesting to me personally of the issues I and the OP raised (knowing it can't be changed)... I'd rather be discussing the others ;)
Jan 05 2012
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 10:02 AM, Manu wrote:
 That said, this is just one of numerous issues myself and the OP raised. I
don't
 know why this one became the most popular for discussion... my suspicion is
that
 is because this is the easiest of my complaints to dismiss and shut down ;)
That's a common phenomenon, known as bikeshedding. Issues that are easy to understand, everyone will weigh in on. The hard ones require an investment of effort to understand, and few will do it.
Jan 05 2012
prev sibling next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Jan 5, 2012, at 10:02 AM, Manu wrote:
 That said, this is just one of numerous issues myself and the OP raised. I don't know why this one became the most popular for discussion... my suspicion is that is because this is the easiest of my complaints to dismiss and shut down ;)

It's also about the only language change among the issues you mentioned. Most of the others are QOI issues for compiler vendors. What I've been curious about is if you really have a need for the performance that would be granted by these features, or if this is more of an idealistic issue.
Jan 05 2012
parent Peter Alexander <peter.alexander.au gmail.com> writes:
On 5/01/12 7:41 PM, Sean Kelly wrote:
 On Jan 5, 2012, at 10:02 AM, Manu wrote:
 That said, this is just one of numerous issues myself and the OP raised. I
don't know why this one became the most popular for discussion... my suspicion
is that is because this is the easiest of my complaints to dismiss and shut
down ;)
It's also about the only language change among the issues you mentioned. Most of the others are QOI issues for compiler vendors. What I've been curious about is if you really have a need for the performance that would be granted by these features, or if this is more of an idealistic issue.
It's not idealistic. For example, in my current project, I have a 3x perf improvement by rewriting that function with a few hundred lines of inline asm, purely to use SIMD instructions. This is a nuisance because:

(a) It's hard to maintain. I have to thoroughly document what registers I'm using for what just so that I don't forget.

(b) Difficult to optimize further. I could optimize the inline assembly further by doing better scheduling of instructions, but instruction scheduling naturally messes up the organization of your code, which makes it a maintenance nightmare.

(c) It's not cross platform. Luckily x86/x86_64 are similar enough that I can write the code once and patch up the differences with CTFE + string mixins.

I know other parts of my code that would benefit from SIMD, but it's too much hassle to write and maintain inline assembly.

If we had support for

    align(16) float[4] a, b;
    a[] += b[]; // addps on x86

then that would only solve the problem when you are doing "float-like" operations (addition, multiplication etc.). There's no obvious existing syntax for doing things like shuffles, conversions, SIMD square roots, cache control etc. that would naturally match to SIMD instructions.

Also, there's no way to tell the compiler whether you want to treat a float[4] as an array or a vector. Vectors are suited for data parallel execution whereas arrays are suited for indexing. If the compiler makes the wrong decision then you suffer heavily.

Ideally, we'd introduce vector types, e.g. vec_float4, vec_int4, vec_double2 etc. These would naturally match to vector registers on CPUs and be aligned appropriately for the target platform. Elementary operations would match naturally and generate the code you expect. Shuffling and other non-elementary operations would require the use of intrinsics.

    // 4 vector norms in parallel
    vec_float4 xs, ys, zs, ws;
    vec_float4 lengths = vec_sqrt(xs * xs + ys * ys + zs * zs + ws * ws);

On x86 w/SSE, this would ideally generate:

    // assuming xs in xmm0, ys in xmm1 etc.
    mulps xmm0, xmm0;
    mulps xmm1, xmm1;
    addps xmm0, xmm1;
    mulps xmm2, xmm2;
    addps xmm0, xmm2;
    mulps xmm3, xmm3;
    addps xmm0, xmm3;
    sqrtps xmm0, xmm0;

On platforms that don't support the vector types natively, there are two options: (1) a compile error, or (2) compile anyway, replacing them with float ops. I think this is the only sensible way forward.
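For contrast, a rough sketch of about the closest you can express today with plain array operations (the function and names are only for illustration; nothing here guarantees the work stays in SIMD registers):

    import std.math : sqrt;

    // 4 vector norms via array ops: the multiplies/adds may or may not be
    // vectorised, and the square roots fall back to one scalar call each
    void norms4(ref float[4] xs, ref float[4] ys, ref float[4] zs,
                ref float[4] ws, ref float[4] lengths)
    {
        float[4] sq;
        sq[] = xs[] * xs[] + ys[] * ys[] + zs[] * zs[] + ws[] * ws[];
        foreach (i, v; sq)
            lengths[i] = sqrt(v);   // no sqrtps here
    }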
Jan 05 2012
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 5 January 2012 21:41, Sean Kelly <sean invisibleduck.org> wrote:

 On Jan 5, 2012, at 10:02 AM, Manu wrote:
 That said, this is just one of numerous issues myself and the OP raised.
I don't know why this one became the most popular for discussion... my suspicion is that is because this is the easiest of my complaints to dismiss and shut down ;) It's also about the only language change among the issues you mentioned. Most of the others are QOI issues for compiler vendors. What I've been curious about is if you really have a need for the performance that would be granted by these features, or if this is more of an idealistic issue.
I think they are *all* language requests. They could be implemented by 3rd party compilers, but these things are certainly nice to standardise in the language, or you end up with C, which is a mess I'm trying to escape.

Of the topics being discussed:

  * vector type - I can't be expected to write a game without using the vector hardware... I'd rather use a portable standardised type than a GDC extension. Why the resistance? I haven't heard any good arguments against, other than some murmurings about float[4] as the official syntax, which I gave detailed arguments against.

  * inline asm supporting pseudo regs - As an extension of using the vector hardware, I'll need inline asm, and inline assembly without pseudo regs is pretty useless... it's mandatory that the compiler schedule the register allocation, otherwise inline asm will most likely be an un-optimisation. If D had pseudo regs in its inline assembler, it would make it REALLY attractive for embedded systems programmers. In lieu of that, I need to use opcode intrinsics instead, which I believe GDC exposes, but again, I'm done with C and versioning (#ifdef-ing) every compiler I intend to use. Why not standardise these things? At least put the intrinsics in the standard lib...

  * __restrict - Not a deal breaker, but the console game dev industry uses this all the time. There are countless articles on the topic (many are private or on console vendor forums). If this is not standardised in the language, GDC will still expose it I'm sure, fragmenting the language.

  * __forceinline - I'd rather have a proper keyword than using tricks like mixins and stuff as have been discussed. The reason for this is debugging. Code is not inlined in debug builds, looks & feels like normal code, can still evaluate and step like regular code... and is guaranteed to inline properly when optimised. This just saves time; no-frills debugging.

  * multiple return values - This is a purely idealistic feature request, but would lead to some really nice optimisations while retaining very tidy code if implemented properly. Other languages support this, it's a nice modern feature... why not have it in D? (For contrast with what's expressible today, see the sketch at the end of this post.)

I couldn't release the games I do without efficient use of vector hardware and __restrict used in appropriate places. I use them every day, and I wouldn't begin a project in D without knowing that these features are supported, or are definitely coming to the language.

On fixed hardware systems, the bar is high and it's very competitive... you can't waste processor time. Trust me when I say that using __restrict appropriately might lead to twice as many particles on screen, or allow more physics bodies, or more accurate simulation. SIMD hardware is mandatory, and will usually increase performance 2-5 times in my experience. The type of code that usually benefits the most ranges from particle simulation, collision/physics, procedural geometry/texturing, and funnily enough, memcopy ;) ... stock memcopy doesn't take advantage of 16-byte SIMD registers for copying memory, and I'll bet D doesn't either.

A little while back I rewrote a bitplane compositor (raw binary munging, not a typical vector hardware job) in VMX. Very tricky, but it was around 10 times faster... which is good, because our game had to hold a rock solid 60fps, and it saved the build and even allowed us to add some more nice features ;)

If D can't compete with, or beat, C, it won't be used in this market on high end products, though it's perhaps still viable on smaller/not-cutting-edge projects if productivity is considered more important. Engine programmers are thoroughly aware of code generation, and in C, tricks/techniques to coerce the compiler to generate the code you want to see are commonplace... and often very, very ugly. I think the industry would be very impressed and enthusiastic if D were able to generate the best possible code with conventional and elegant language semantics, without annoying tricks or proprietary compiler extensions to do so.
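On the multiple-return-values point, a rough sketch of the closest thing D offers today -- a library tuple, which comes back as an ordinary struct rather than in registers, so the ABI benefit described above is not actually delivered (the names below are only for illustration):

    import std.typecons : Tuple, tuple;

    // returns quotient and remainder together, packed into a struct
    Tuple!(int, int) divmod(int a, int b)
    {
        return tuple(a / b, a % b);
    }

    void example()
    {
        auto r = divmod(7, 3);
        assert(r[0] == 2 && r[1] == 1);
    }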
Jan 05 2012
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Jan 5, 2012, at 12:36 PM, Manu wrote:

 On 5 January 2012 21:41, Sean Kelly <sean invisibleduck.org> wrote:
 On Jan 5, 2012, at 10:02 AM, Manu wrote:
 That said, this is just one of numerous issues myself and the OP raised. I don't know why this one became the most popular for discussion... my suspicion is that is because this is the easiest of my complaints to dismiss and shut down ;)

 It's also about the only language change among the issues you mentioned. Most of the others are QOI issues for compiler vendors. What I've been curious about is if you really have a need for the performance that would be granted by these features, or if this is more of an idealistic issue.

 I think they are all language requests. They could be implemented by 3rd party compilers, but these things are certainly nice to standardise in the language, or you end up with C, which is a mess I'm trying to escape.

It's a grey area I suppose. Half of the features you list don't change the language, they simply allow the compiler to make optimizations it otherwise couldn't. I suppose the D way would be to make them ' ' prefixed and provide some means for having the compiler ignore them if it didn't recognize them. This wouldn't fragment the language so much as make generated code more efficient on supporting platforms.

 Of the topics being discussed:
   * vector type - I can't be expected to write a game without using the vector hardware... I'd rather use a portable standardised type than a GDC extension. Why the resistance? I haven't heard any good arguments against, other than some murmurings about float[4] as the official syntax, which I gave detailed arguments against.

Could a vector type be defined in the library? Aside from alignment, there doesn't seem to be anything that requires compiler support. Or am I missing something?

   * inline asm supporting pseudo regs - As an extension of using the vector hardware, I'll need inline asm, and inline assembly without pseudo regs is pretty useless... it's mandatory that the compiler schedule the register allocation, otherwise inline asm will most likely be an un-optimisation. If D had pseudo regs in its inline assembler, it would make it REALLY attractive for embedded systems programmers.

This would certainly be nice. When I drop into ASM I generally couldn't care less about which actual register I use. I just want to call something specific.

     In lieu of that, I need to use opcode intrinsics instead, which I believe GDC exposes, but again, I'm done with C and versioning (#ifdef-ing) every compiler I intend to use. Why not standardise these things? At least put the intrinsics in the standard lib...

   * __restrict - Not a deal breaker, but the console game dev industry uses this all the time. There are countless articles on the topic (many are private or on console vendor forums). If this is not standardised in the language, GDC will still expose it I'm sure, fragmenting the language.

   * __forceinline - I'd rather have a proper keyword than using tricks like mixins and stuff as have been discussed. The reason for this is debugging. Code is not inlined in debug builds, looks & feels like normal code, can still evaluate and step like regular code... and is guaranteed to inline properly when optimised. This just saves time; no-frills debugging.

I'd say these are QOI issues, as above.

   * multiple return values - This is a purely idealistic feature request, but would lead to some really nice optimisations while retaining very tidy code if implemented properly. Other languages support this, it's a nice modern feature... why not have it in D?

I can see the ABI rules for this getting really complicated, much like how parameter passing rules on x64 are insanely complex compared to x32. But I agree that it would be a nice feature to have.

 I couldn't release the games I do without efficient use of vector hardware and __restrict used in appropriate places. I use them every day, and I wouldn't begin a project in D without knowing that these features are supported, or are definitely coming to the language.

I know it's a major time commitment, but the best way to realize any new feature quickly is to create a pull request. Feature proposals have a way of being lost if they never extend beyond this newsgroup.
Jan 05 2012
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 5 January 2012 23:03, Sean Kelly <sean invisibleduck.org> wrote:

 I think they are all language requests. They could be implemented by 3rd
party compilers, but these things are certainly nice to standardise in the language, or you end up with C, which is a mess I'm trying to escape. It's a grey area I suppose. Half of the features you list don't change the language, they simply allow the compiler to make optimizations it otherwise couldn't. I suppose the D way would be to make them ' ' prefixed and provide some means for having the compiler ignore them if it didn't recognize them. This wouldn't fragment the language so much as make generated code more efficient on supporting platforms.
Precisely. That's all I want :) .. formal definition of these concepts, allowing GDC and co to feed that information through to the backend which already supports these concepts :)
 Could a vector type be defined in the library?  Aside from alignment,
 there doesn't seem to be anything that requires compiler support.  Or am I
 missing something?
I think there's a lot you're missing. Load/store patterns, register allocation, parameter argument convention (ABI details), literals and assignment, exception handling/error conditions, and alignment... The actual working functional stuff could/would be done in a library, but even then, if the opcode intrinsic names weren't standardised, it'd be an awful mess behind the scenes aggregating all the different names/terminology from each compiler implementation.
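For illustration, a naive library-only sketch (the type and names are invented) of how far alignment and operators get you -- everything in the list above is still left entirely to the compiler:

    // a library type can request alignment and express element-wise maths,
    // but it cannot dictate register allocation, the calling convention, or
    // guarantee that the array ops below actually become SIMD instructions
    struct Vec4
    {
        align(16) float[4] v;

        Vec4 opBinary(string op : "+")(Vec4 rhs) const
        {
            Vec4 r;
            r.v[] = v[] + rhs.v[];
            return r;
        }
    }

    void example()
    {
        Vec4 a, b;
        a.v[] = 1;
        b.v[] = 2;
        auto c = a + b;   // passed and returned through the plain struct ABI
    }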
   * inline asm supporting pseudo regs - As an extension of using the
 vector hardware, I'll need inline asm, and inline assembly without pseudo
 regs is pretty useless... it's mandatory that the compiler shedule the
 register allocation otherwise inline asm will most likely be an
 un-optimisation. If D had pseudo regs in its inline assembler, it would
 make it REALLY attractive for embedded systems programmers.

 This would certainly be nice.  When I drop into ASM I generally could care
 less about which actual register I use.  I just want to call something
 specific.
It's usually actually disruptive to the program around the inline asm block if you are naming registers explicitly... better to let the compiler assign them, and intelligently flush values to the stack if it runs out of registers. AT&T asm syntax allows this: you define 'parameter' types using some silly constraint string, and then refer to %1, %2, %3 in place of registers in your asm code, and the compiler performs the register assignment. Most C compilers also don't allow program optimisation/rescheduling around inline asm blocks, which makes them useless, and I'll bet GDC suffers the same problem right now (...?)
   * __restrict - Not a deal breaker, but console game dev industry uses
 this all the time. There are countless articles on the topic (many are
 private or on console vendor forums). If this is not standardised in the
 language, GDC will still expose it I'm sure, fragmenting the language.
   * __forceinline - I'd rather have a proper keyword than using tricks
like mixins and stuff as have been discussed. The reason for this is debugging. Code is not-inlined in debug builds, looks&feels like normal code, can still evaluate, and step like regular code... and guarantee that it inlines properly when optimised. This just saves time; no frills debugging. I'd say these are QOI issues, as above.
Yup, so just standardise the names for these attributes and pass the info through to GCC's back end... done :) ... Although I'm sure __forceinline isn't so simple.
   * multiple return values - This is a purely idealistic feature
request, but would lead to some really nice optimisations while retaining very tidy code if implemented properly. Other languages support this, it's a nice modern feature... why not have it in D? I can see the ABI rules for this getting really complicated, much like how parameter passing rules on x64 are insanely complex compared to x32. But I agree that it would be a nice feature to have.
How so? Parameters are passed in a sequence of regs of the appropriate type, to a point, at which stage they get put on the stack... is x64 somehow more complicated than that? Multiple return values would use the exact same regs in reverse. There should be no side effects; the calling function has already stored off (or has the capability to store off) any save regs in order to pass args in the first place.
 I couldn't release the games I do without efficient use of vector
 hardware and __restrict used in appropriate places. I use them every day,
 and I wouldn't begin a project in D without knowing that these features are
 supported, or are definitely coming to the language.

 I know it's a major time commitment, but the best way to realize any new
 feature quickly is to create a pull request.  Feature proposals have a way
 of being lost if they never extend beyond this newsgroup.
Fair call, but I don't have time to get involved at that level right now, not by a long shot. I'm just a potential customer trying to give it a fair go at this point... ;)
Jan 05 2012
prev sibling next sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 5 January 2012 20:36, Manu <turkeyman gmail.com> wrote:
 On 5 January 2012 21:41, Sean Kelly <sean invisibleduck.org> wrote:
 On Jan 5, 2012, at 10:02 AM, Manu wrote:
 That said, this is just one of numerous issues myself and the OP raised. I don't know why this one became the most popular for discussion... my suspicion is that is because this is the easiest of my complaints to dismiss and shut down ;)

 It's also about the only language change among the issues you mentioned. Most of the others are QOI issues for compiler vendors. What I've been curious about is if you really have a need for the performance that would be granted by these features, or if this is more of an idealistic issue.

 I think they are all language requests. They could be implemented by 3rd party compilers, but these things are certainly nice to standardise in the language, or you end up with C, which is a mess I'm trying to escape.

 Of the topics being discussed:
   * vector type - I can't be expected to write a game without using the vector hardware... I'd rather use a portable standardised type than a GDC extension. Why the resistance? I haven't heard any good arguments against, other than some murmurings about float[4] as the official syntax, which I gave detailed arguments against.

I dabbled with vector types; none of the vector builtins are hashed out in GDC, because it really does require that a vector be a unique type, distinct from a normal array.

   * inline asm supporting pseudo regs - As an extension of using the vector hardware, I'll need inline asm, and inline assembly without pseudo regs is pretty useless... it's mandatory that the compiler schedule the register allocation, otherwise inline asm will most likely be an un-optimisation. If D had pseudo regs in its inline assembler, it would make it REALLY attractive for embedded systems programmers.
     In lieu of that, I need to use opcode intrinsics instead, which I believe GDC exposes, but again, I'm done with C and versioning (#ifdef-ing) every compiler I intend to use. Why not standardise these things? At least put the intrinsics in the standard lib...

This is only possible using GDC extended asm - which is really GCC asm but encapsulated in {} instead of ();

   * __restrict - Not a deal breaker, but the console game dev industry uses this all the time. There are countless articles on the topic (many are private or on console vendor forums). If this is not standardised in the language, GDC will still expose it I'm sure, fragmenting the language.

I don't think D enforces any sort of aliasing rules, but it would be nice to turn on strict aliasing though...

   * __forceinline - I'd rather have a proper keyword than using tricks like mixins and stuff as have been discussed. The reason for this is debugging. Code is not inlined in debug builds, looks & feels like normal code, can still evaluate and step like regular code... and is guaranteed to inline properly when optimised. This just saves time; no-frills debugging.

__forceinline still won't be a guarantee though.

--
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
Jan 05 2012
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Jan 5, 2012, at 1:42 PM, Manu wrote:

 Most C compilers also don't allow program optimisation/rescheduling around inline asm blocks, this makes them useless, and I'll bet GDC suffers the same problem right now (...?)

Not sure about GDC.  At one time, I pushed for keeping "volatile" alive so asm blocks could be labeled to tell the compiler not to optimize across them.  I recall Walter rejecting the idea because compilers shouldn't optimize across asm blocks.  This should probably be revisited at some point.

   * __restrict - Not a deal breaker, but the console game dev industry uses this all the time. There are countless articles on the topic (many are private or on console vendor forums). If this is not standardised in the language, GDC will still expose it I'm sure, fragmenting the language.

   * __forceinline - I'd rather have a proper keyword than using tricks like mixins and stuff as have been discussed. The reason for this is debugging. Code is not inlined in debug builds, looks & feels like normal code, can still evaluate and step like regular code... and is guaranteed to inline properly when optimised. This just saves time; no-frills debugging.

 I'd say these are QOI issues, as above.

 Yup, so just standardise the names for these attributes and pass the info through to GCC's back end... done :) ... Although I'm sure __forceinline isn't so simple.

I'm sure it isn't.  Though as long as compilers aren't required to support it, it's just a matter of making it work, right? ;-)

   * multiple return values - This is a purely idealistic feature request, but would lead to some really nice optimisations while retaining very tidy code if implemented properly. Other languages support this, it's a nice modern feature... why not have it in D?

 I can see the ABI rules for this getting really complicated, much like how parameter passing rules on x64 are insanely complex compared to x32.  But I agree that it would be a nice feature to have.

 How so? Parameters are passed in a sequence of regs of the appropriate type, to a point, at which stage they get put on the stack... is x64 somehow more complicated than that?
 Multiple return values would use the exact same regs in reverse. There should be no side effects, the calling function has already (or has the capability to) stored off any save regs in order to pass args in the first place.

No, that's exactly how x64 works.  But compared to x32, where everything is simply pushed onto the stack... That's all I was saying.

 Fair call, but I don't have time to get involved at that level right now, not by a long shot.
 I'm just a potential customer trying to give it a fair go at this point... ;)

Then please be as specific as you can :-)  Don might have some experience in this area, but I suspect Walter does not.
Jan 05 2012
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 5 January 2012 23:50, Iain Buclaw <ibuclaw ubuntu.com> wrote:

   * inline asm supporting pseudo regs - As an extension of using the
vector
 hardware, I'll need inline asm, and inline assembly without pseudo regs
is
 pretty useless... it's mandatory that the compiler shedule the register
 allocation otherwise inline asm will most likely be an un-optimisation.
If D
 had pseudo regs in its inline assembler, it would make it REALLY
attractive
 for embedded systems programmers.
     In lieu of that, I need to use opcode intrinsics instead, which I
 believe GDC exposes, but again, I'm done with C and versioning
(#ifdef-ing)
 every compiler I intend to use. Why not standardise these things? At
least
 put the intrinsics in the standard lib...
This is only possible using GDC extended asm - which is really GCC asm but encapsulated in {} instead of ();
... shit. I fear this is a VERY serious problem that needs discussion and resolution. Now we have 2 competing standards of asm syntax in D... We're exactly in the same place as VisualC and GCC now. Epic fail.
Jan 05 2012
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 5 January 2012 23:53, Sean Kelly <sean invisibleduck.org> wrote:

 I recall Walter rejecting the idea because compilers shouldn't optimize
 across asm blocks.  This should probably be revisited at some point.
It's proven then, inline asm blocks break the optimiser... they are officially useless. This is why all C coders use intrinsics these days, and D should too.
 How so? Parameters are passed in a sequence of regs of the appropriate type, to a point, at which stage they get put on the stack... is x64 somehow more complicated than that?
 Multiple return values would use the exact same regs in reverse. There should be no side effects, the calling function has already (or has the capability to) stored off any save regs in order to pass args in the first place.

 No, that's exactly how x64 works.  But compared to x32, where everything is simply pushed onto the stack... That's all I was saying.
Ah yes, I forgot... x86 is such a shit architecture! ;) .. I rarely write x86 code, it's fairly pointless usually. Chips don't execute the opcodes you write anyway, they reinterpret and microcode them.. you can never know what's best, and it's different for every x86 processor/vendor :P Despite the assembly you read, x86 doesn't REALLY push all those args to the stack, they have much larger register banks internally, and use them... finally, x64 put an end to the nostalgic x86 nonsense :)
Jan 05 2012
prev sibling next sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 5 January 2012 22:01, Manu <turkeyman gmail.com> wrote:
 On 5 January 2012 23:50, Iain Buclaw <ibuclaw ubuntu.com> wrote:
   * inline asm supporting pseudo regs - As an extension of using the vector hardware, I'll need inline asm, and inline assembly without pseudo regs is pretty useless... it's mandatory that the compiler schedule the register allocation, otherwise inline asm will most likely be an un-optimisation. If D had pseudo regs in its inline assembler, it would make it REALLY attractive for embedded systems programmers.
     In lieu of that, I need to use opcode intrinsics instead, which I believe GDC exposes, but again, I'm done with C and versioning (#ifdef-ing) every compiler I intend to use. Why not standardise these things? At least put the intrinsics in the standard lib...
This is only possible using GDC extended asm - which is really GCC asm but encapsulated in {} instead of ();
... shit. I fear this is a VERY serious problem that needs discussion and resolutio=
n.
 Now we have 2 competing standards of asm syntax in D...
 We're exactly in the same place as VisualC and GCC now. Epic fail.
Why? The reasoning behind it is more so that you can write asm statements on all architectures, not just x86. And with GDC being a frontend of GCC, it seems a natural thing to support (this has actually been in GDC since 2004, so I'm not sure why you would throw your arms up about it now).

--
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
Jan 05 2012
prev sibling next sibling parent Artur Skawina <art.08.09 gmail.com> writes:
On 01/05/12 22:50, Iain Buclaw wrote:
 I don't think D enforces any sort of aliasing rules, but it would be
 nice to turn on strict aliasing though...
-fstrict-aliasing is already turned on by default in gdc... artur
Jan 05 2012
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 6 January 2012 00:10, Iain Buclaw <ibuclaw ubuntu.com> wrote:

 The reasoning behind is more so that you can write asm statements on
 all architectures, not just x86. And with GDC being a frontend of GCC,
 seems a natural thing to support (this has actually been in GDC since
 2004, so I'm not sure why you should through all arms up about it
 now).
When I was first reading about D, I read that the inline assembler syntax is built in and standardised in the language... and I gave a large sigh of relief. If that's not the case, and there are competing asm syntaxes in D, well... that sucks. Am I version-ing my asm blocks for DMD and GDC now, like I have to in C for VC and GCC? Surely D should settle on just one... If that happens to be the GCC syntax for compatibility, great...?
Jan 05 2012
prev sibling next sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 5 January 2012 22:16, Manu <turkeyman gmail.com> wrote:
 On 6 January 2012 00:10, Iain Buclaw <ibuclaw ubuntu.com> wrote:
 The reasoning behind is more so that you can write asm statements on
 all architectures, not just x86. And with GDC being a frontend of GCC,
 seems a natural thing to support (this has actually been in GDC since
 2004, so I'm not sure why you should through all arms up about it
 now).
When I was first reading about D I read that the inline assembler syntax is built in and standardised in the language... and I gave a large sigh of relief. If that's not the case, there are competing asm syntax in D, well... that sucks. Am I version-ing my asm blocks for DMD and GDC now like I have to in C for VC and GCC? Surely D should settle on just one... If that happens to be the GCC syntax for compatibility, great...?
For all its intentions, I think D-style syntax is great; however for GDC, all lines need to be translated into GCC-equivalent syntax when emitting the AST.

Example - what ARM assembly would look like in D:

asm
{
    cmp R0, R1;
    blt Lbmax;
    mov R2, R0;
    b Lrest;
Lbmax:
    mov R2, R1;
Lrest:
}

In order to compile *correctly*, we must be able to tell GCC what are outputs, what are inputs, what are labels, and what gets clobbered. The backend needs to know this to ensure the syntax is correct, and so it doesn't try to do anything odd that may invalidate what you are trying to do. ie:

Output operands: the compiler can check that outputs are lvalues.
Clobbers: tell the backend that a register is not free to use as a place to store temporary values.
Labels: tell the backend that this asm block of code could jump to a given location, meaning it should be protected from the usual dead code elimination passes.

For this to work requires the frontend to be *aware* of what assembly language it is compiling, and to be able to parse it and understand it correctly. Which would not be the most pleasant of things to implement, given the number of supported architectures. Converting x86 Intel syntax assembly to x86 GCC syntax assembly is enough for me. :)

--
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
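For comparison, a sketch of how that same block might look in the GCC-style extended form being discussed -- this assumes GDC accepts GCC's constraint syntax inside a {} asm block, and the constraint strings and local labels below are illustrative rather than tested:

int bmax(int a, int b)
{
    // %0 is the output operand, %1/%2 the inputs, "cc" marks the condition
    // flags as clobbered, and 1:/2: are GNU-as local labels
    int r = void;
    asm {
        "cmp %1, %2\n\tblt 1f\n\tmov %0, %1\n\tb 2f\n1:\n\tmov %0, %2\n2:"
        : "=r" (r)
        : "r" (a), "r" (b)
        : "cc";
    }
    return r;
}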
Jan 05 2012
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Jan 5, 2012, at 2:08 PM, Manu wrote:

 On 5 January 2012 23:53, Sean Kelly <sean invisibleduck.org> wrote:
 I recall Walter rejecting the idea because compilers shouldn't optimize across asm blocks. This should probably be revisited at some point.

 It's proven then, inline asm blocks break the optimiser... they are officially useless. This is why all C coders use intrinsics these days, and D should too.

For the record, some compilers do optimize across asm blocks. It's simply DMD/DMC that doesn't. Though the lack of "volatile" makes doing this unsafe in D as a general rule.
Jan 05 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 2:57 PM, Sean Kelly wrote:
 For the record, some compilers do optimize across asm blocks.  It's simply
 DMD/DMC that doesn't.  Though the lack of "volatile" makes doing this unsafe
 in D as a general rule.
dmd does keep track of register usage within asm blocks.
Jan 05 2012
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Jan 5, 2012, at 3:56 PM, Walter Bright wrote:

 On 1/5/2012 2:57 PM, Sean Kelly wrote:
 For the record, some compilers do optimize across asm blocks.  It's simply DMD/DMC that doesn't.  Though the lack of "volatile" makes doing this unsafe in D as a general rule.

 dmd does keep track of register usage within asm blocks.

Oh right, I guess it would have to, since variables can be used by name within asm blocks. I guess it just doesn't do code movement across asm blocks then?
Jan 05 2012
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 4:03 PM, Sean Kelly wrote:
 Oh right, I guess it would have to, since variables can be used by name
 within asm blocks.  I guess it just doesn't do code movement across asm
 blocks then?
Right. More generally, it does not do data flow analysis within an asm block, treating it as a black box that could do anything.
Jan 07 2012
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 8:49 AM, Manu wrote:
 I also wonder if the D language provides some opportunities for optimisation
 that aren't expressible in other languages,
There are some. One that is currently being exploited by the optimizer and back end is the existence of pure functions.
 and therefore may not already have
 an expression in the GCC back end... so I can imagine some of future
 optimisations frequently discussed in this forum won't just magically appear
 with GCC/LLVM maturity. I can't imagine Iain and co extending the GCC back end
 to support some obscure D optimisations happening any time soon.
Right.
 At some level I'd like to think people would accept the word of a seasoned game
 engine dev who's genuinely interested in adopting the language for that sort of
 work, but I completely understand those who are skeptical. ;)
I'm interested in hearing more. (The virtual thing can't change.)
Jan 05 2012
parent reply Manu <turkeyman gmail.com> writes:
On 5 January 2012 23:30, Walter Bright <newshound2 digitalmars.com> wrote:

 On 1/5/2012 8:49 AM, Manu wrote:

 I also wonder if the D language provides some opportunities for
 optimisation
 that aren't expressible in other languages,
There are some. One that is currently being exploited by the optimizer and back end is the existence of pure functions.
Does GDC currently support these same optimisations, or is this a DMD special power?
Jan 05 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 1:43 PM, Manu wrote:
 On 5 January 2012 23:30, Walter Bright <newshound2 digitalmars.com
 <mailto:newshound2 digitalmars.com>> wrote:

     On 1/5/2012 8:49 AM, Manu wrote:

         I also wonder if the D language provides some opportunities for
optimisation
         that aren't expressible in other languages,


     There are some. One that is currently being exploited by the optimizer and
     back end is the existence of pure functions.


 Does GDC currently support these same optimisations, or is this a DMD special
power?
I don't know what GDC does.
Jan 05 2012
parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 6 January 2012 00:03, Walter Bright <newshound2 digitalmars.com> wrote:
 On 1/5/2012 1:43 PM, Manu wrote:
 On 5 January 2012 23:30, Walter Bright <newshound2 digitalmars.com

 <mailto:newshound2 digitalmars.com>> wrote:

    On 1/5/2012 8:49 AM, Manu wrote:

        I also wonder if the D language provides some opportunities for optimisation
        that aren't expressible in other languages,

    There are some. One that is currently being exploited by the optimizer and
    back end is the existence of pure functions.

 Does GDC currently support these same optimisations, or is this a DMD
 special power?

 I don't know what GDC does.

GDC ties D optimisations to function attributes of GCC, so you'll have to do some reading up on the meaning. :) There are three levels of purity; these are matched to the const, pure and novops attributes.

--
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
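Roughly, and using ordinary D declarations to stand in for that mapping (illustrative code only), the first two levels look like:

// strongly pure: no mutable indirections among the parameters, so calls with
// the same arguments can be folded or reused freely
pure nothrow int square(int x)
{
    return x * x;
}

// weakly pure: callable from pure code, but it may write through its ref
// parameter, so calls cannot be cached or reordered as aggressively
pure nothrow int fill(ref int[4] a, int value)
{
    foreach (ref e; a) e = value;
    return value;
}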
Jan 05 2012
prev sibling next sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 5 January 2012 16:49, Manu <turkeyman gmail.com> wrote:
 D is not a compiler, it is a language. Furthermore it is not true that
 DMD's backend is rubbish and there are already more backends than just the
 DMC backend.

 Sorry, I was generalising a little in that claim. And where I say 'rubbish', I was drawing comparison to the maturity of C compilers for x86, which STILL have trouble and make lots of mistakes despite centuries of man hours of work.
 DMD has inferior code gen (there was a post last night comparing some disassemblies of trivial programs with GDC) and will probably never rival GCC; that's fine, it's a reference, I get that.
 But to say that using GDC will magically fix code gen is also false. I'm not familiar with the GCC code, so I may be wrong, but my understanding is that there is frontend work, and frontend-GCC glue work, that will allow back end optimisation (which GCC can do quite well) to work properly. This is still a lot of work for a small OSS team.
 I also wonder if the D language provides some opportunities for optimisation that aren't expressible in other languages, and therefore may not already have an expression in the GCC back end... so I can imagine some of the future optimisations frequently discussed in this forum won't just magically appear with GCC/LLVM maturity. I can't imagine Iain and co extending the GCC back end to support some obscure D optimisations happening any time soon.

Actually, it's just me. ;)

So far I have come across no D optimisations that aren't supported in GCC. In fact, most of the time I find myself thinking of how I can use obscure GCC optimisation X to improve D.

One example is an interesting feature of Fortran, though written with C++ in mind. Seems like something that could be right up D's street.

http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=147822

--
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
Jan 05 2012
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
So regarding my assumptions about translating the D front end expressions
to GCC? Is that all simpler than I imagine?
Do you think GDC generates optimal code comparable to C code?

What about pure functions, can you make good on optimisations like caching
results of pure functions, moving them outside loops, etc?

On 6 January 2012 00:03, Iain Buclaw <ibuclaw ubuntu.com> wrote:

 On 5 January 2012 16:49, Manu <turkeyman gmail.com> wrote:
 D is not a compiler, it is a language. Furthermore it is not true that
 DMDs backend is rubbish and there are already more backends than just
the
 DMC backend.
Sorry, I was generalising a little general in that claim. And where I say 'rubbish', I was drawing comparison to the maturity of C compilers for
x86,
 which STILL have trouble and make lots of mistakes with centuries of man
 hours of work.
 DMD has inferior code gen (there was a post last night comparing some
 disassemblies of trivial programs with GDC), will probably never rival
GCC,
 that's fine, it's a reference, I get that.
 But to say that using GDC will magically fix code gen is also false. I'm
not
 familiar with the GCC code, so I may be wrong, but my understanding is
that
 there is frontend work, and frontend-GCC glue work that will allow for
back
 end optimisation (which GCC can do quite well) to work properly. This is
 still a lot of work for a small OSS team.
 I also wonder if the D language provides some opportunities for
optimisation
 that aren't expressible in other languages, and therefore may not already
 have an expression in the GCC back end... so I can imagine some of future
 optimisations frequently discussed in this forum won't just magically
appear
 with GCC/LLVM maturity. I can't imagine Iain and co extending the GCC
back
 end to support some obscure D optimisations happening any time soon.
Actually, it's just me. ;) So far I have come across no D optimisations that aren't supported in GCC. Infact, most of the time I find myself thinking of how I can use obscure GCC optimisation X to improve D. One example is an interesting feature of Fortran, though written with C++ in mind. Seems like something that could be right up D's street. http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=147822 -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Jan 05 2012
prev sibling next sibling parent reply Iain Buclaw <ibuclaw ubuntu.com> writes:
On 5 January 2012 22:11, Manu <turkeyman gmail.com> wrote:
 So regarding my assumptions about translating the D front end expressions to
 GCC? Is that all simpler than I imagine?
 Do you think GDC generates optimal code comparable to C code?

 What about pure functions, can you make good on optimisations like caching
 results of pure functions, moving them outside loops, etc?
I think you are confusing the pure with memoization. I could be wrong however... :) -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Jan 05 2012
next sibling parent reply Peter Alexander <peter.alexander.au gmail.com> writes:
On 5/01/12 10:17 PM, Iain Buclaw wrote:
 On 5 January 2012 22:11, Manu<turkeyman gmail.com>  wrote:
 So regarding my assumptions about translating the D front end expressions to
 GCC? Is that all simpler than I imagine?
 Do you think GDC generates optimal code comparable to C code?

 What about pure functions, can you make good on optimisations like caching
 results of pure functions, moving them outside loops, etc?
I think you are confusing the pure with memoization. I could be wrong however... :)
I think Manu is right:

void foo(int x)
{
    int[10] a;
    foreach (ref e; a)
        e = bar(x);
}

If bar is pure then you can safely transform this into:

void foo(int x)
{
    int[10] a;
    auto barx = bar(x);
    foreach (ref e; a)
        e = barx;
}
Jan 05 2012
parent bearophile <bearophileHUGS lycos.com> writes:
Peter Alexander:

 void foo(int x)
 {
      int[10] a;
      foreach (ref e; a)
          e = bar(x);
 }
 
 If bar is pure then you can safely transform this into:
 
 void foo(int x)
 {
      int[10] a;
      auto barx = bar(x);
      foreach (ref e; a)
          e = barx;
 }
If bar is pure but it throws exceptions, the two versions of the code behave differently, so it's a wrong optimization. You need bar to be pure nothrow. Moving pure nothrow functions out of loops is an easy optimization, and even simple D compilers are meant to do it. Aggressively optimizing D compilers are also free to memoize some results of pure (and probably nothrow too) functions.

-------

Regarding the discussion about virtual functions, show me a D compiler able to de-virtualize very well, as the Oracle JVM does :-) Some time ago I asked the LLVM devs to improve this situation for LDC; they have now fixed most of my bug reports, so I think they will eventually fix this too (maybe partially):
http://llvm.org/bugs/show_bug.cgi?id=3100

Bye,
bearophile
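To make that concrete, a minimal sketch of a declaration the hoist is actually valid for (bar's body here is just a stand-in):

// pure alone is not enough: a throwing bar() would behave differently if
// hoisted; pure nothrow lets the call be moved out of the loop safely
pure nothrow int bar(int x)
{
    return x * x + 1;
}

void foo(int x)
{
    int[10] a;
    foreach (ref e; a)
        e = bar(x);   // eligible to be computed once, outside the loop
}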
Jan 05 2012
prev sibling parent reply Peter Alexander <peter.alexander.au gmail.com> writes:
On 5/01/12 10:17 PM, Iain Buclaw wrote:
 On 5 January 2012 22:11, Manu<turkeyman gmail.com>  wrote:
 So regarding my assumptions about translating the D front end expressions to
 GCC? Is that all simpler than I imagine?
 Do you think GDC generates optimal code comparable to C code?

 What about pure functions, can you make good on optimisations like caching
 results of pure functions, moving them outside loops, etc?
I think you are confusing the pure with memoization. I could be wrong however... :)
I think Manu is right:

void foo(int x)
{
    int[10] a;
    foreach (ref e; a)
        e = bar(x);
}

If bar is pure then you can safely transform this into:

void foo(int x)
{
    int[10] a;
    auto barx = bar(x);
    foreach (ref e; a)
        e = barx;
}

If bar is not pure then this transformation would be unsafe.
Jan 05 2012
parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 5 January 2012 23:40, Peter Alexander <peter.alexander.au gmail.com> wrote:
 On 5/01/12 10:17 PM, Iain Buclaw wrote:
 On 5 January 2012 22:11, Manu<turkeyman gmail.com>  wrote:
 So regarding my assumptions about translating the D front end expressions to
 GCC? Is that all simpler than I imagine?
 Do you think GDC generates optimal code comparable to C code?

 What about pure functions, can you make good on optimisations like caching
 results of pure functions, moving them outside loops, etc?

 I think you are confusing the pure with memoization.  I could be wrong however... :)

 I think Manu is right:

 void foo(int x)
 {
     int[10] a;
     foreach (ref e; a)
         e = bar(x);
 }

 If bar is pure then you can safely transform this into:

 void foo(int x)
 {
     int[10] a;
     auto barx = bar(x);
     foreach (ref e; a)
         e = barx;
 }

 If bar is not pure then this transformation would be unsafe.

Yes, it will do something like that, though the loop will be unrolled - and given that gdc supports vectorisation, I think the above example will likely be vectorised too. So off the top of my head:

void foo(int x)
{
    int[10] a;
    auto barx = bar(x);
    vector(4) vect = { barx, barx, barx, barx };
    *[&a]    = vect;
    *[&a+16] = vect;
    a[8] = barx;
    a[9] = barx;
}

Regards

--
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
Jan 05 2012
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 6 January 2012 00:17, Iain Buclaw <ibuclaw ubuntu.com> wrote:

 On 5 January 2012 22:11, Manu <turkeyman gmail.com> wrote:
 So regarding my assumptions about translating the D front end
expressions to
 GCC? Is that all simpler than I imagine?
 Do you think GDC generates optimal code comparable to C code?

 What about pure functions, can you make good on optimisations like
caching
 results of pure functions, moving them outside loops, etc?
I think you are confusing the pure with memoization. I could be wrong however... :)
Umm, maybe... but I don't think so. And I don't think you just answered either of my questions ;)
Jan 05 2012
prev sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 5 January 2012 22:22, Manu <turkeyman gmail.com> wrote:
 On 6 January 2012 00:17, Iain Buclaw <ibuclaw ubuntu.com> wrote:
 On 5 January 2012 22:11, Manu <turkeyman gmail.com> wrote:
 So regarding my assumptions about translating the D front end
 expressions to
 GCC? Is that all simpler than I imagine?
 Do you think GDC generates optimal code comparable to C code?

 What about pure functions, can you make good on optimisations like
 caching
 results of pure functions, moving them outside loops, etc?
I think you are confusing the pure with memoization. I could be wrong however... :)
Umm, maybe... but I don't think so. And I don't think you just answered either of my questions ;)
What I meant was, the pure attribute will only allow the compiler to reduce the number of times the function is called in certain circumstances. However, it makes no guarantees that it will do so, only if it thinks it's appropriate.

--
Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
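(For what it's worth, when caching is actually wanted rather than merely permitted, it can be requested explicitly; a minimal sketch using Phobos' std.functional.memoize, with bar standing in for some costly pure function:)

import std.functional : memoize;

int bar(int x) pure { return x * x; }   // stand-in for an expensive pure function

alias cachedBar = memoize!bar;          // explicit, guaranteed caching of results

void foo(int x)
{
    int[10] a;
    auto barx = cachedBar(x);           // computed once here, reused below
    foreach (ref e; a)
        e = barx;
}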
Jan 05 2012
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 1:02 AM, Manu wrote:
 My argument is that even IF the compiler some day attempts to make vector
 optimisations to float[4] arrays, the raw hardware should be exposed first, and
 allow programmers to use it directly. This starts with a language defined
 (platform independant) v128 type.
Manu, I appreciate your expertise in this matter, which I lack. I think you've made a great case. Can you flesh this out with more specific suggestions on what language changes would work best?
Jan 05 2012
parent reply Manu <turkeyman gmail.com> writes:
On 5 January 2012 22:47, Walter Bright <newshound2 digitalmars.com> wrote:

 On 1/5/2012 1:02 AM, Manu wrote:

 My argument is that even IF the compiler some day attempts to make vector
 optimisations to float[4] arrays, the raw hardware should be exposed
 first, and
 allow programmers to use it directly. This starts with a language defined
 (platform independant) v128 type.
Manu, I appreciate your expertise in this matter, which I lack. I think you've made a great case. Can you flesh this out with more specific suggestions on what language changes would work best?
Love to. I'll give it some thorough thought. There's more details than I think most would expect...
Jan 05 2012
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 1:22 PM, Manu wrote:
 Love to. I'll give it some thorough thought. There's more details than I think
 most would expect...
I know the devil is in the details :-) Anyhow, please start a new topic for that one. This thread is getting too large.
Jan 05 2012
prev sibling parent reply a <a a.com> writes:
 A language defined 128bit SIMD type would be fine for basically all
 architectures. Even though they support different operations on these
 registers, the size and allocation patterns are always the same across all
 architectures; 128 bits, 16byte aligned, etc. This allows at minimum
 platform independent expression of structures containing simd data, and
 calling of functions passing these types as args.
You forgot about AVX. It uses 256 bit registers and is supported in new Intel and AMD processors.
Jan 05 2012
parent Manu <turkeyman gmail.com> writes:
On 5 January 2012 22:33, a <a a.com> wrote:

 A language defined 128bit SIMD type would be fine for basically all
 architectures. Even though they support different operations on these
 registers, the size and allocation patterns are always the same across
all
 architectures; 128 bits, 16byte aligned, etc. This allows at minimum
 platform independent expression of structures containing simd data, and
 calling of functions passing these types as args.
You forgot about AVX. It uses 256 bit registers and is supported in new Intel and AMD processors.
AVX is another type, a new 256bit type, as double is to float, and should also have a keyword ;)
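(For reference, fixed-size vector types along these lines are what D's core.simd ended up providing; a minimal sketch, assuming a target with SSE support - Particle and add are made-up names:)

import core.simd;

struct Particle
{
    float4 position;   // 128-bit, 16-byte-aligned vector field
    float4 velocity;
}

float4 add(float4 a, float4 b)
{
    return a + b;      // element-wise add, performed in a vector register
}

// core.simd also declares 256-bit types such as double4/float8 for AVX targets.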
Jan 05 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/4/12 3:39 AM, Manu wrote:
   * __forceinline ... I wasn't aware this didn't exist... and yes,
 despite all this discussion, I still depend on this all the time. People
 are talking about implementing forceinline by immitating macros using
 mixins.... crazy? Here's a solid reason I avoid mixins or procedurally
 generated code (and the preprocessor in C for that matter, in favour of
 __forceinline): YOU CAN DEBUG IT. In an inline function, the code exists
 in the source file, just like any other function, you can STEP THE
 DEBUGGER through it, and inspect the values easily. This is an
 underrated requirement. I would waste hours on many days if I couldn't
 do this. I would only ever use string mixins for the most obscure uses,
 preferring inline functions for the sake of debugging 99% of the time.
Hmmm, I see that the other way around. D CTFE-generated macros are much easier to debug because you can actually print the code before mixing it in. If it looks like valid code... great. I think the deal with inline functions is significantly more complex. Inlining is the first step in a long pipeline of optimizations that often make the code virtually unrecognizable and impossible to map back to source in a way that's remotely understandable. Andrei
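(A minimal sketch of that workflow; genCode is a made-up CTFE function:)

string genCode(string name)
{
    return "int " ~ name ~ "() { return 42; }";
}

enum code = genCode("answer");
pragma(msg, code);   // dump the generated code at compile time and eyeball it
mixin(code);         // ...then mix it in

static assert(answer() == 42);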
Jan 04 2012
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Wednesday, 4 January 2012 at 14:28:07 UTC, Andrei Alexandrescu 
wrote:
 Hmmm, I see that the other way around. D CTFE-generated macros 
 are much easier to debug because you can actually print the 
 code before mixing it in. If it looks like valid code... great.
Paging Don Clugston: would it be feasable to have the compiler remember the source position of every single char/string literal or compile-time-evaluated string expression? I'm thinking that if the compiler tracks the source of every string/char literal in the source code, all across to any manipulations, debugging CTFE-generated code would be a lot easier - the compiler would emit error messages pointing inside string literals, and debuggers could step inside code in string literals. (The one thing this doesn't allow is allowing debuggers to step through a DSL with no D code in it.) The naive implementation would store the position of every character, which would blow up the memory usage by about 13 times or so on 32-bit? (For every character, add a struct with 3 fields - char* filename; int line, column). A rope-like structure could cut down on that but possibly drastically complicating the implementation.
Jan 04 2012
next sibling parent Rainer Schuetze <r.sagitario gmx.de> writes:
On 04.01.2012 15:33, Vladimir Panteleev wrote:
 On Wednesday, 4 January 2012 at 14:28:07 UTC, Andrei Alexandrescu wrote:
 Hmmm, I see that the other way around. D CTFE-generated macros are
 much easier to debug because you can actually print the code before
 mixing it in. If it looks like valid code... great.
Paging Don Clugston: would it be feasable to have the compiler remember the source position of every single char/string literal or compile-time-evaluated string expression? I'm thinking that if the compiler tracks the source of every string/char literal in the source code, all across to any manipulations, debugging CTFE-generated code would be a lot easier - the compiler would emit error messages pointing inside string literals, and debuggers could step inside code in string literals. (The one thing this doesn't allow is allowing debuggers to step through a DSL with no D code in it.) The naive implementation would store the position of every character, which would blow up the memory usage by about 13 times or so on 32-bit? (For every character, add a struct with 3 fields - char* filename; int line, column). A rope-like structure could cut down on that but possibly drastically complicating the implementation.
A simpler way to debug CTFE generated code is to dump it to another file and redirect the debugger to this file. Here is a patch that does just that, but it is probably not up to date anymore: http://d.puremagic.com/issues/show_bug.cgi?id=5051#c4
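(Short of patching the compiler, a similar effect can be had by hand: materialise the generated code as a real source file once, then compile and debug that file normally. A rough sketch, with made-up names:)

// gen.d - run once to write the generated code out as an ordinary module
import std.file : write;

string genCode()
{
    return "int answer() { return 42; }\n";
}

void main()
{
    write("generated.d", genCode());   // then import/compile generated.d as usual
}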
Jan 04 2012
prev sibling next sibling parent "Martin Nowak" <dawg dawgfoto.de> writes:
On 04.01.2012 at 15:33, Vladimir Panteleev <vladimir thecybershadow.net> wrote:

 On Wednesday, 4 January 2012 at 14:28:07 UTC, Andrei Alexandrescu wrote:
 Hmmm, I see that the other way around. D CTFE-generated macros are much  
 easier to debug because you can actually print the code before mixing  
 it in. If it looks like valid code... great.
Paging Don Clugston: would it be feasable to have the compiler remember the source position of every single char/string literal or compile-time-evaluated string expression? I'm thinking that if the compiler tracks the source of every string/char literal in the source code, all across to any manipulations, debugging CTFE-generated code would be a lot easier - the compiler would emit error messages pointing inside string literals, and debuggers could step inside code in string literals. (The one thing this doesn't allow is allowing debuggers to step through a DSL with no D code in it.) The naive implementation would store the position of every character, which would blow up the memory usage by about 13 times or so on 32-bit? (For every character, add a struct with 3 fields - char* filename; int line, column). A rope-like structure could cut down on that but possibly drastically complicating the implementation.
Last time I generated a big and complex mixin, I let the compiler output mixins to separate files. This gives you nicer debugging and readable error lines. https://github.com/D-Programming-Language/dmd/pull/426
Jan 09 2012
prev sibling parent Manu <turkeyman gmail.com> writes:
On 9 January 2012 12:01, Martin Nowak <dawg dawgfoto.de> wrote:

 On 04.01.2012 at 15:33, Vladimir Panteleev <vladimir thecybershadow.net> wrote:

  On Wednesday, 4 January 2012 at 14:28:07 UTC, Andrei Alexandrescu wrote:
 Hmmm, I see that the other way around. D CTFE-generated macros are much
 easier to debug because you can actually print the code before mixing it
 in. If it looks like valid code... great.
Paging Don Clugston: would it be feasable to have the compiler remember the source position of every single char/string literal or compile-time-evaluated string expression? I'm thinking that if the compiler tracks the source of every string/char literal in the source code, all across to any manipulations, debugging CTFE-generated code would be a lot easier - the compiler would emit error messages pointing inside string literals, and debuggers could step inside code in string literals. (The one thing this doesn't allow is allowing debuggers to step through a DSL with no D code in it.) The naive implementation would store the position of every character, which would blow up the memory usage by about 13 times or so on 32-bit? (For every character, add a struct with 3 fields - char* filename; int line, column). A rope-like structure could cut down on that but possibly drastically complicating the implementation.
Last time I generated a big and complex mixin, I let the compiler output mixins to separate files. This gives you nicer debugging and readable error lines. https://github.com/D-Programming-Language/dmd/pull/426
Amazing idea, this should be standard! That totally changes my feelings towards mixins :)
Jan 09 2012
prev sibling parent reply Manu <turkeyman gmail.com> writes:
On 4 January 2012 16:28, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org>wrote:

 On 1/4/12 3:39 AM, Manu wrote:

  * __forceinline ... I wasn't aware this didn't exist... and yes,
 despite all this discussion, I still depend on this all the time. People
 are talking about implementing forceinline by immitating macros using
 mixins.... crazy? Here's a solid reason I avoid mixins or procedurally

 generated code (and the preprocessor in C for that matter, in favour of
 __forceinline): YOU CAN DEBUG IT. In an inline function, the code exists
 in the source file, just like any other function, you can STEP THE
 DEBUGGER through it, and inspect the values easily. This is an
 underrated requirement. I would waste hours on many days if I couldn't
 do this. I would only ever use string mixins for the most obscure uses,
 preferring inline functions for the sake of debugging 99% of the time.
Hmmm, I see that the other way around. D CTFE-generated macros are much easier to debug because you can actually print the code before mixing it in. If it looks like valid code... great. I think the deal with inline functions is significantly more complex. Inlining is the first step in a long pipeline of optimizations that often make the code virtually unrecognizable and impossible to map back to source in a way that's remotely understandable.
It's rare to step through optimised code. You tend to debug and step in debug/unoptimised builds, where inline functions are usually not even inlined, and code flow still looks natural, and easy to follow.. This saves lots of time. C/C++ macros present the same problem of not being able to step and inspect values. Most industry programmers I work with tend to avoid macros for this reason above all others.
Jan 04 2012
parent Jerry <jlquinn optonline.net> writes:
Manu <turkeyman gmail.com> writes:
 It's rare to step through optimised code. You tend to debug and step in debug/
 unoptimised builds, where inline functions are usually not even inlined, and
 code flow still looks natural, and easy to follow.. This saves lots of time.
 C/C++ macros present the same problem of not being able to step and inspect
 values. Most industry programmers I work with tend to avoid macros for this
 reason above all others.
I do it all the time. Normally I with optimized builds because our code takes a long time to run, so if I can find the problem without going to the debug build, I save a fair bit of time. I find that you get used to the weirdnesses that show up when stepping through optimized code. I'm probably able to find problems I'm looking for 60-70% of the time without resorting to using the debug build. Jerry
Jan 05 2012
prev sibling next sibling parent Artur Skawina <art.08.09 gmail.com> writes:
On 01/04/12 16:31, Artur Skawina wrote:
 Function attributes seem like they could be an easy, backward compatible addition:

   @attr(attributes...)

 then define some obvious generic attributes like "inline" (which is (always|force)_inline, as it's the only one that makes sense), "noinline", "hot", "cold" etc. This lets you write "@attr(inline) int f(i){}" etc, but doesn't help the vendor specific attr case at all, unfortunately. [2]
User defined attributes. "@attr regparm=whatever_the_compiler_uses;", conditioned on version(). Would let you do "@attr(regparm) int f(int i){}" in a portable way. Still not sure I like it, but could work. Would have to accept and expand to multiple attrs though... artur
Jan 04 2012
prev sibling next sibling parent reply Artur Skawina <art.08.09 gmail.com> writes:
On 01/04/12 10:39, Manu wrote:
 Walter made an argument "The same goes for all those language extensions you
mentioned. Those are not part of Standard C. They are vendor extensions. Does
that mean that C is not actually a systems language? No."
 This is absurd... are you saying that you expect Iain to add these things to
GDC to that people can use them, and then create incompatible D code with the
'standard' compiler?
Some of these things are *already* in GDC... Probably not documented and tested enough [1], but they are there. So you /can/ have function declarations such as:

   pragma(GNU_attribute, always_inline, flatten, hot)
   int fxx(int i) { ... }

Now, this wouldn't be that bad, if we had a preprocessor or some kind of macro facility. But as it is, writing portable code is too expensive. (I may need to add a cpp stage for D because of issues like this, haven't decided yet...)
 Why would you intentionally fragment the compiler support of language features
rather than just making trivial (but important) features that people do use
part of the language?
There are more issues, that *will* be fixed in time, once (maybe even "if") D matures. A wiki page etc listing all the needed changes ("D next generation") would definitely be helpful. Not only to record what needs fixing, but also what to avoid. Could reduce the inevitable balkanization significantly.

Function attributes seem like they could be an easy, backward compatible addition:

   @attr(attributes...)

then define some obvious generic attributes like "inline" (which is (always|force)_inline, as it's the only one that makes sense), "noinline", "hot", "cold" etc. This lets you write "@attr(inline) int f(i){}" etc, but doesn't help the vendor specific attr case at all, unfortunately. [2]

artur

[1] Like the "target" "tune" (sub)attribute, which is not per function, but global (ie behaves as a C #pragma). That might be a gcc bug though. Also, using gcc asm in a function makes the compiler generate worse code (triggered by the /first/ asm use, next ones within a function are free).

[2] The problem is what do you do when you have a lot of functions/methods that need to be inlined/flattened, have a different calling convention or otherwise need to be specially marked _and_ it needs to be done differently for different compilers?...
Jan 04 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-01-04 16:31, Artur Skawina wrote:
 On 01/04/12 10:39, Manu wrote:
 Walter made an argument "The same goes for all those language extensions you
mentioned. Those are not part of Standard C. They are vendor extensions. Does
that mean that C is not actually a systems language? No."
 This is absurd... are you saying that you expect Iain to add these things to
GDC to that people can use them, and then create incompatible D code with the
'standard' compiler?
Some of these things are *already* in GDC... Probably not documented and tested enough [1], but they are there. So you /can/ have function declarations such as: pragma(GNU_attribute, always_inline, flatten, hot) int fxx(int i) { ... }
If you want your code to be portable (between compilers) you would need to wrap that in a version statement. -- /Jacob Carlborg
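(For illustration, such wrapping might look like the sketch below; GDC defines version(GNU), and the function body here is just a placeholder:)

version (GNU)
{
    pragma(GNU_attribute, always_inline, flatten, hot)
    int fxx(int i) { return i * 2; }
}
else
{
    int fxx(int i) { return i * 2; }
}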
Jan 04 2012
parent reply Artur Skawina <art.08.09 gmail.com> writes:
On 01/05/12 08:19, Jacob Carlborg wrote:
 On 2012-01-04 16:31, Artur Skawina wrote:
 On 01/04/12 10:39, Manu wrote:
 Walter made an argument "The same goes for all those language extensions you
mentioned. Those are not part of Standard C. They are vendor extensions. Does
that mean that C is not actually a systems language? No."
 This is absurd... are you saying that you expect Iain to add these things to
GDC to that people can use them, and then create incompatible D code with the
'standard' compiler?
Some of these things are *already* in GDC... Probably not documented and tested enough [1], but they are there. So you /can/ have function declarations such as: pragma(GNU_attribute, always_inline, flatten, hot) int fxx(int i) { ... }
If you want your code to be portable (between compilers) you would need to wrap that in a version statement.
Exactly. Which isn't a problem if you have one or two such functions. But it becomes one when you have hundreds. And different compilers use different conventions, some do not support every feature and/or need specific tweaks. Copy-and-pasting multiline "declaration attribute blocks" for every function that needs them does not really scale well.

In C/C++ this is CPP territory where you solve it with a #define, and all of the magic is both hidden and easily accessible in one place. Adding support for another compiler requires only editing of that one header, not modifying practically the whole project. Let's not even think about compiler version specific tweaks (due to compiler bugs or features appearing in newer versions)...

D, being in its infancy, may have been able to ignore these issues so far (having only one D frontend helps too), but w/o a std, every vendor will have to invent a way to expose non-std features. For common things such as forcing functions to be inlined, keeping them out of line, marking them as hot/cold, putting them in specific text sections etc, relying on vendor extensions is not really necessary.

It's bad enough that every compiler will use a different incompatible runtime, in some cases calling conventions - and consequently different shared libraries; reducing source code portability (even if just by making things harder than they should be) will lead to more balkanization...

artur
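(The closest D analogue to the single #define header is probably to keep the per-compiler attribute string in one place and mix it in at each declaration; a rough sketch with made-up names, reusing the GNU_attribute pragma shown earlier:)

// one central place holds the per-compiler magic
version (GNU)
    enum hotInline = "pragma(GNU_attribute, always_inline, hot)";
else
    enum hotInline = "";    // other compilers: no special marking

// every marked function goes through the same mixin
mixin(hotInline ~ q{
    int fxx(int i) { return i * 2; }
});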
Jan 05 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-01-05 10:54, Artur Skawina wrote:
 On 01/05/12 08:19, Jacob Carlborg wrote:
 On 2012-01-04 16:31, Artur Skawina wrote:
 On 01/04/12 10:39, Manu wrote:
 Walter made an argument "The same goes for all those language extensions you
mentioned. Those are not part of Standard C. They are vendor extensions. Does
that mean that C is not actually a systems language? No."
 This is absurd... are you saying that you expect Iain to add these things to
GDC to that people can use them, and then create incompatible D code with the
'standard' compiler?
Some of these things are *already* in GDC... Probably not documented and tested enough [1], but they are there. So you /can/ have function declarations such as: pragma(GNU_attribute, always_inline, flatten, hot) int fxx(int i) { ... }
If you want your code to be portable (between compilers) you would need to wrap that in a version statement.
Exactly. Which isn't a problem if you have one or two such functions. But becomes one when you have hundreds. And different compilers use different conventions, some do not support every feature and/or need specific tweaks. Copy-and-pasting multiline "declaration attribute blocks" for every function that needs them does not really scale well. In C/C++ this is CPP territory where you solve it with a #define, and all of the magic is both hidden and easily accessible in one place. Adding support for another compiler requires only editing of that one header, not modifying practically the whole project. Let's not even think about compiler version specific tweaks (due to compiler bugs or features appearing in newer versions)... D, being in its infancy, may have been able to ignore these issues so far (having only one D frontend helps too), but w/o a std, every vendor will have to invent a way to expose non-std features. For common things such as forcing functions to be inlined, keeping them out of line, marking them as hot/cold, putting them in specific text sections etc relying on vendor extensions is not really necessary. It's bad enough that every compiler will use a different incompatible runtime, in some cases calling conventions - and consequently different shared libraries; reducing source code portability (even if just by making things harder than they should be) will lead to more balkanization... artur
The pragma is a standard way to expose non-standard features. You just need to wrap it in version statements because the compiler will otherwise complain about unrecognized pragmas. If that's a good thing or not, I don't know. -- /Jacob Carlborg
Jan 05 2012
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 5 January 2012 at 13:40:20 UTC, Jacob Carlborg wrote:
 The pragma is a standard way to expose non-standard features. 
 You just need to wrap it in version statements because the 
 compiler will otherwise complain about unrecognized pragmas. If 
 that's a good thing or not, I don't know.
DMD has an -ignore switch:

  -ignore   ignore unsupported pragmas
Jan 05 2012
parent Jacob Carlborg <doob me.com> writes:
On 2012-01-05 14:59, Vladimir Panteleev wrote:
 On Thursday, 5 January 2012 at 13:40:20 UTC, Jacob Carlborg wrote:
 The pragma is a standard way to expose non-standard features. You just
 need to wrap it in version statements because the compiler will
 otherwise complain about unrecognized pragmas. If that's a good thing
 or not, I don't know.
DMD has an -ignore switch: -ignore ignore unsupported pragmas
I had no idea. -- /Jacob Carlborg
Jan 06 2012
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
Oh, and virtual-by-default... completely unacceptable for a systems
language. most functions are NOT virtual, and finding the false-virtuals
while optimising will be extremely tedious and time consuming. Worse, if
libraries contain false virtuals, there's good chance I may not be able to
use said library on certain architectures (PPC, ARM in particular).
Terrible decision... completely contrary to modern hardware design and
trends. Why invent a 'new' language for 10 year old hardware?

On 4 January 2012 11:39, Manu <turkeyman gmail.com> wrote:

 This conversation has meandered into one very specific branch, but I just
 want to add my 2c to the OP.
 I agree, I want D to be a useful systems language too. These are my issues
 to that end:

  * __forceinline ... I wasn't aware this didn't exist... and yes, despite
 all this discussion, I still depend on this all the time. People are
 talking about implementing forceinline by immitating macros using mixins...
 crazy? Here's a solid reason I avoid mixins or procedurally generated code
 (and the preprocessor in C for that matter, in favour of __forceinline):
 YOU CAN DEBUG IT. In an inline function, the code exists in the source
 file, just like any other function, you can STEP THE DEBUGGER through it,
 and inspect the values easily. This is an underrated requirement. I would
 waste hours on many days if I couldn't do this. I would only ever use
 string mixins for the most obscure uses, preferring inline functions for
 the sake of debugging 99% of the time.

  * vector type ... D has exactly no way to tell the compiler to allocate
 128bit vector registers, load/store, and pass then to/from functions. That
 is MOST of the register memory on virtually every modern processor, and D
 can't address it... wtf?

  * inline assembler needs pseudo registers ... The inline assembler is
 pretty crap, immitating C which is out-dated. Registers in assembly code
 should barely ever be addressed directly, they should only be addressed by
 TYPE, allowing the compiler to allocate available registers (and/or manage
 storing the the stack where required) as with any other code. Inline
 assembly without pseudo-registers is almost always an un-optimisation, and
 this is also the reason why almost all C programmers use hardware opcode
 intrinsics instead of inline assembly. There is no way without using
 intrinsics in C to allow the compiler to perform optimal register
 allocation, and this is still true for D, and in my opinion, just plain
 broken.

  * __restrict ... I've said this before, but not being able to hint that
 the compiler ignore possible pointer aliasing is a big performance problem,
 especially when interacting with C libs.

  * multiple return values (in registers) ... (just because I opened a
 topic about it before) This saves memory accesses in common cases where i
 want to return (x, y), or (retVal, errorCode) for instance.

 Walter made an argument "The same goes for all those language extensions
 you mentioned. Those are not part of Standard C. They are vendor
 extensions. Does that mean that C is not actually a systems language? No."
 This is absurd... are you saying that you expect Iain to add these things
 to GDC to that people can use them, and then create incompatible D code
 with the 'standard' compiler?
 Why would you intentionally fragment the compiler support of language
 features rather than just making trivial (but important) features that
 people do use part of the language?

 This is a great example of why C is shit, and a good example of why I'm
 interested in D at all...

 On 29 December 2011 13:19, Vladimir Panteleev <vladimir thecybershadow.net
 wrote:
 On Thursday, 29 December 2011 at 09:16:23 UTC, Walter Bright wrote:

 Are you a ridiculous hacker? Inline x86 assembly that the compiler
 actually understands in 32 AND 64 bit code, hex string literals like x"DE
 ADB EEF" where spacing doesn't matter, the ability to set data alignment
 cross-platform with type.alignof = 16, load your shellcode verbatim into a
 string like so: auto str = import("shellcode.txt");
I would like to talk about this for a bit.

Personally, I think D's system programming abilities are only half-way there. Note that I am not talking about use cases in high-level application code, but rather low-level, widely-used framework code, where every bit of performance matters (for example: memory copy routines, string builders, garbage collectors).

In-line assembler as part of the language is certainly neat, and in fact coming from Delphi to C++ I was surprised to learn that C++ implementations adopted different syntax for asm blocks. However, compared to some C++ compilers, it has severe limitations and is D's only trick in this alley.

For one thing, there is no way to force the compiler to inline a function (like __forceinline / __attribute((always_inline)) ). This is fine for high-level code (where users are best left with PGO and "the compiler knows best"), but sucks if you need a guarantee that the function must be inlined. The guarantee isn't just about inlining heuristics, but also implementation capabilities. For example, some implementations might not be able to inline functions that use certain language features, and your code's performance could demand that such a short function must be inlined. One example of this is inlining functions containing asm blocks - IIRC DMD does not support this. The compiler should fail the build if it can't inline a function tagged with forceinline, instead of shrugging it off and failing silently, forcing users to check the disassembly every time.

You may have noticed that GCC has some ridiculously complicated assembler facilities. However, they also open the way to the possibilities of writing optimal code - for example, creating custom calling conventions, or inlining assembler functions without restricting the caller's register allocation with a predetermined calling convention. In contrast, DMD is very conservative when it comes to mixing D and assembler. One time I found that putting an asm block in a function turned what were single instructions into blocks of 6 instructions each.

D's lacking in this area makes it impossible to create language features that are on the level of D's compiler built-ins. For example, I have tested three memcpy implementations recently, but none of them could beat DMD's standard array slice copy (despite that in release mode it compiles to a simple memcpy call). Why? Because the overhead of using a custom memcpy routine negated its performance gains. This might have been alleviated with the presence of sane macros, but no such luck. String mixins are not the answer: trying to translate macro-heavy C code to D using string mixins is string escape hell, and we're back to the level of shell scripts.

We've discussed this topic on IRC recently. From what I understood, Andrei thinks improvements in this area are not "impactful" enough, which I find worrisome. Personally, I don't think D qualifies as a true "system programming language" in light of the above. It's more of a compiled language with pointers and assembler.

Before you disagree with any of the above, first (for starters) I'd like to invite you to translate Daniel Vik's C memcpy implementation to D: http://www.danielvik.com/2010/02/fast-memcpy-in-c.html. It doesn't even use inline assembler or compiler intrinsics.
Jan 04 2012
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/4/2012 10:53 AM, Manu wrote:
 Oh, and virtual-by-default... completely unacceptable for a systems language.
  most functions are NOT virtual, and finding the false-virtuals while
 optimising will be extremely tedious and time consuming.
The only reason to use classes in D is for polymorphic behavior - and that means virtual functions. Even so, a class member function will be called directly if it is private or marked as 'final'.

An easy way to find functions that are not overridden (what you called false virtuals) is to add:

    final:

at the top of your class definition. The compiler will give you errors for any functions that need to be virtual.

If you don't want polymorphic behavior, use structs instead. Struct member functions are never virtual.
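(A minimal sketch of how that plays out, with made-up names: the one genuinely polymorphic method is declared above the final: label, everything after it is non-virtual and called directly. Had final: been placed at the very top, the compiler would reject Derived's override of draw - exactly the signal that draw needs to stay virtual.)

class Base
{
    void draw() { }          // kept virtual, because Derived overrides it
final:                       // everything from here on is non-virtual
    void update() { }
    int width() { return 0; }
}

class Derived : Base
{
    override void draw() { }
}

void main()
{
    Base b = new Derived;
    b.draw();      // virtual dispatch: calls Derived.draw
    b.update();    // direct call: Base.update is final
}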
 Worse, if libraries contain false virtuals, there's good chance I may not be
 able to use said library on certain architectures (PPC, ARM in particular).
??
Jan 04 2012
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Walter:

 The only reason to use classes in D is for polymorphic behavior - and that
means
 virtual functions.
I don't agree, in some cases I use final class instances instead of heap-allocated structs even when I don't need polymorphic behaviour just to avoid pointer syntax (there is also a bit higher probability of destructors being called, compared to heap-allocated structs). In some cases I've used a final class just to be able to use a this() with no arguments :-) Bye, bearophile
Jan 04 2012
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/4/2012 3:21 PM, bearophile wrote:
 The only reason to use classes in D is for polymorphic behavior - and that
 means virtual functions.
I don't agree, in some cases I use final class instances instead of heap-allocated structs even when I don't need polymorphic behaviour just to avoid pointer syntax (there is also a bit higher probability of destructors being called, compared to heap-allocated structs). In some cases I've used a final class just to be able to use a this() with no arguments :-)
There's no reason to avoid pointer syntax, because D has:

1. ref types
2. automatic dereferencing of pointers
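(A tiny sketch with made-up names, showing the automatic dereferencing:)

struct S { int x; }

void main()
{
    S* p = new S;     // GC-allocated struct, used through a pointer
    p.x = 3;          // no explicit dereference needed; same syntax as a value
    (*p).x = 4;       // the explicit form also works, but is rarely necessary
}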
Jan 04 2012
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2012-01-05 00:21, bearophile wrote:
 Walter:

 The only reason to use classes in D is for polymorphic behavior - and that
means
 virtual functions.
I don't agree, in some cases I use final class instances instead of heap-allocated structs even when I don't need polymorphic behaviour just to avoid pointer syntax (there is also a bit higher probability of destructors being called, compared to heap-allocated structs). In some cases I've used a final class just to be able to use a this() with no arguments :-) Bye, bearophile
You can get that with a static opCall for structs too. -- /Jacob Carlborg
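(A minimal sketch of that idiom, with a made-up struct: since a struct cannot declare a no-argument this(), a static opCall supplies the S() call syntax instead.)

struct S
{
    int x;

    static S opCall()   // stands in for a no-argument constructor
    {
        S s;
        s.x = 42;
        return s;
    }
}

void main()
{
    auto s = S();       // invokes the static opCall
    assert(s.x == 42);
}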
Jan 04 2012
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
You just missed a big discussion on IRC about this, where I think I made
some fair points that people actually agreed with.

On 1/4/2012 10:53 AM, Manu wrote:
 Oh, and virtual-by-default... completely unacceptable for a systems
 language.
  most functions are NOT virtual, and finding the false-virtuals while
 optimising will be extremely tedious and time consuming.
The only reason to use classes in D is for polymorphic behavior - and that means virtual functions. Even so, a class member function will be called directly if it is private or marked as 'final'.
Is this true? Surely the REAL reason to use classes is to allocate using the GC? Aren't struct's allocated on the stack, and passed to functions by value? Do I need to start using the ref keyword to use GC allocated structs?
 An easy way to find functions that are not overridden (what you called
 false virtuals) is to add:

   final:

 at the top of your class definition. The compiler will give you errors for
 any functions that need to be virtual.

 If you don't want polymorphic behavior, use structs instead. Struct member
 functions are never virtual.
I have never written a class in any language where the ratio of virtual to non-virtual functions is more than 1:10 or so... requiring that one explicitly declared the vastly more common case seems crazy. The thing I'm most worried about is people forgetting to declare 'final:' on a class, or junior programmers who DON'T declare final, perhaps because they don't understand it, or perhaps because they have 1-2 true-virtuals, and the rest are just defined in the same place... This is DANGEROUS. The junior programmer problem is one that can NOT be overstated, and doesn't seem to have been considered in a few design choices. I'll bet MOST classes result in an abundance of false-virtuals, and this is extremely detrimental to performance on modern hardware (and getting worse, not better, as hardware progresses).
  Worse, if libraries contain false virtuals, there's good chance I may not
 be
 able to use said library on certain architectures (PPC, ARM in
 particular).
??
If a library makes liberal (and completely unnecessary) virtual calls to the point where it performs too poorly on some architecture; let's say ARM, or PPC (architectures that will suffer far more than x86 from virtual calls), I can no longer use this library in my project... What a stupid position to be in. The main strength of any language is its wealth of libraries available, and a bad language decision prohibiting use of libraries for absolutely no practical reason is just broken by my measure.
Jan 04 2012
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/05/2012 12:26 AM, Manu wrote:
 You just missed a big discussion on IRC about this, where I think I made
 some fair points that people actually agreed with.

     On 1/4/2012 10:53 AM, Manu wrote:

         Oh, and virtual-by-default... completely unacceptable for a
         systems language.
           most functions are NOT virtual, and finding the false-virtuals
         while
         optimising will be extremely tedious and time consuming.


     The only reason to use classes in D is for polymorphic behavior -
     and that means
     virtual functions. Even so, a class member function will be called
     directly if
     it is private or marked as 'final'.


 Is this true? Surely the REAL reason to use classes is to allocate using
 the GC?
You can allocate any type using the GC.
 Aren't struct's allocated on the stack, and passed to functions by
 value? Do I need to start using the ref keyword to use GC allocated structs?
No.
     An easy way to find functions that are not overridden (what you
     called false virtuals) is to add:

        final:

     at the top of your class definition. The compiler will give you
     errors for any functions that need to be virtual.

     If you don't want polymorphic behavior, use structs instead. Struct
     member
     functions are never virtual.


 I have never written a class in any language where the ratio of virtual
 to non-virtual functions is more than 1:10 or so... requiring that one
 explicitly declared the vastly more common case seems crazy.
Are you sure that is the case? In my code, most class member functions are true virtual.
Jan 04 2012
parent reply Manu <turkeyman gmail.com> writes:
On 5 January 2012 01:40, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 01/05/2012 12:26 AM, Manu wrote:

 You just missed a big discussion on IRC about this, where I think I made
 some fair points that people actually agreed with.

    On 1/4/2012 10:53 AM, Manu wrote:

        Oh, and virtual-by-default... completely unacceptable for a
        systems language.
          most functions are NOT virtual, and finding the false-virtuals
        while
        optimising will be extremely tedious and time consuming.


    The only reason to use classes in D is for polymorphic behavior -
    and that means
    virtual functions. Even so, a class member function will be called
    directly if
    it is private or marked as 'final'.


 Is this true? Surely the REAL reason to use classes is to allocate using
 the GC?
You can allocate any type using the GC. Aren't struct's allocated on the stack, and passed to functions by
 value? Do I need to start using the ref keyword to use GC allocated
 structs?
No. An easy way to find functions that are not overridden (what you
    called false virtuals) is to add:

       final:

    at the top of your class definition. The compiler will give you
    errors for any functions that need to be virtual.

    If you don't want polymorphic behavior, use structs instead. Struct
    member
    functions are never virtual.


 I have never written a class in any language where the ratio of virtual
 to non-virtual functions is more than 1:10 or so... requiring that one
 explicitly declared the vastly more common case seems crazy.
Are you sure that is the case? In my code, most class member functions are true virtual.
Here's one I'm working on right now (C++). Base class for a UI system, surely one of the most heavily polymorphic types of code one can imagine. Count the virtuals... http://pastebin.com/dLUVvFsL
Jan 04 2012
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 01/05/2012 12:54 AM, Manu wrote:
 On 5 January 2012 01:40, Timon Gehr <timon.gehr gmx.ch
 <mailto:timon.gehr gmx.ch>> wrote:

     On 01/05/2012 12:26 AM, Manu wrote:

         You just missed a big discussion on IRC about this, where I
         think I made
         some fair points that people actually agreed with.

             On 1/4/2012 10:53 AM, Manu wrote:

                 Oh, and virtual-by-default... completely unacceptable for a
                 systems language.
                   most functions are NOT virtual, and finding the
         false-virtuals
                 while
                 optimising will be extremely tedious and time consuming.


             The only reason to use classes in D is for polymorphic
         behavior -
             and that means
             virtual functions. Even so, a class member function will be
         called
             directly if
             it is private or marked as 'final'.


         Is this true? Surely the REAL reason to use classes is to
         allocate using
         the GC?


     You can allocate any type using the GC.

         Aren't struct's allocated on the stack, and passed to functions by
         value? Do I need to start using the ref keyword to use GC
         allocated structs?


     No.

             An easy way to find functions that are not overridden (what you
             called false virtuals) is to add:

                final:

             at the top of your class definition. The compiler will give you
             errors for any functions that need to be virtual.

             If you don't want polymorphic behavior, use structs instead.
         Struct
             member
             functions are never virtual.


         I have never written a class in any language where the ratio of
         virtual
         to non-virtual functions is more than 1:10 or so... requiring
         that one
         explicitly declared the vastly more common case seems crazy.


     Are you sure that is the case?
     In my code, most class member functions are true virtual.


 Here's one I'm working on right now (C++).
 Base class for a UI system, surely one of the most heavily polymorphic
 types of code one can imagine.
Apparently that is not true.
 Count the virtuals... http://pastebin.com/dLUVvFsL
9/~65 approx 1:6.
Jan 04 2012
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/4/2012 3:26 PM, Manu wrote:
 Is this true? Surely the REAL reason to use classes is to allocate using the
GC?
 Aren't struct's allocated on the stack, and passed to functions by value? Do I
 need to start using the ref keyword to use GC allocated structs?
struct S { ... }

S* s = new S();   // struct is allocated on the GC
 I have never written a class in any language where the ratio of virtual to
 non-virtual functions is more than 1:10 or so... requiring that one explicitly
 declared the vastly more common case seems crazy.
I found the opposite to be true when I use OOP in C++. Either scheme is valid, saying one is crazy or prohibitive way overstates the case. (I have had a lot of bad experiences in C++ with accidentally overriding a non-virtual function. It's perfectly valid in C++, but man does your code behave bizarrely when you do it.) In any sensible class design, you're going to have to decide which functions are overrideable and which are not. There's no way around it, and no magic default.
 The thing I'm most worried about is people forgetting to declare 'final:' on a
 class, or junior programmers who DON'T declare final, perhaps because they
don't
 understand it, or perhaps because they have 1-2 true-virtuals, and the rest are
 just defined in the same place... This is DANGEROUS.
It isn't dangerous, it is just less optimal. What is dangerous is (in C++) the ability to override a non-virtual function, and the use of non-virtual destructors. It's also true that D's design makes it possible for a compiler to make direct calls if it is doing whole-program analysis and determines that there are no overrides of it.
Jan 04 2012
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Walter:

 What is dangerous is (in C++) the 
 ability to override a non-virtual function, and the use of non-virtual
destructors.
There is something left that I'd like to see D care more about, method hiding:

class Foo {
    string name = "c1";
    static void foo() {}
}
class Bar : Foo {
    string name = "c2";
    static void foo() {} // silent method hiding
}
void main() {}

class Foo {
    string name = "c1";
    static void foo() {}
}
class Bar : Foo {
    string name = "c2";
    static new void foo() {} // method hiding is now visible
}
void main() {}
Jan 04 2012
parent reply "Jesse Phillips" <jessekphillips+D gmail.com> writes:
On Thursday, 5 January 2012 at 01:36:44 UTC, bearophile wrote:
 Walter:

 What is dangerous is (in C++) the ability to override a 
 non-virtual function, and the use of non-virtual destructors.
There is something left that I'd like to see D care more about, method hiding:

class Foo {
    string name = "c1";
    static void foo() {}
}
class Bar : Foo {
    string name = "c2";
    static void foo() {} // silent method hiding
}
void main() {}
Should we just disallow this? If the function wasn't static it would just override foo. Or is that changing once override is required?
Jan 04 2012
parent reply bearophile <bearophileHUGS lycos.com> writes:
Jesse Phillips:

 class Foo {
   string name = "c1";
   static void foo() {}
 }
 class Bar : Foo {
   string name = "c2";
   static void foo() {} // silent method hiding
 }
 void main() {}
Should we just disallow this?
Sometimes it's a useful idiom, and probably some D code in the wild is using it already, so I don't think we should disallow it. I was just asking to force it to be syntactically explicit, just like override will do in D2. It seems Delphi too does the same thing using a different keyword (this is not too surprising, the language designers are partially the same). So far I have seen no arguments against the requirement (initially just a warning if you compile with -w) to use a keyword such as "new" there, while I have
 If the function wasn't static it would just override foo.
 Or is that changing once override is required?
Override usage is going to be (hopefully soon) compulsory in D (currently you need -w to see an error). So that code without both static and override is going to be refused :-) Bye, bearophile
Jan 05 2012
parent reply "Jesse Phillips" <jessekphillips+D gmail.com> writes:
On Thursday, 5 January 2012 at 23:12:21 UTC, bearophile wrote:
 Override usage is going to be (hopefully soon) compulsory in D 
 (currently you need -w to see an error). So that code without 
 both static and override is going to be refused :-)

 Bye,
 bearophile
I guess the question I was getting at, currently there is no way with 'new.' Is that intended once 'override' is required? and if not why have 'new' usable for static methods?
Jan 05 2012
parent bearophile <bearophileHUGS lycos.com> writes:
Jesse Phillips:

 currently there is no way with 'new.' Is that intended once 'override' is
 required? and if not why have 'new' usable for static methods?
As far as I know, no other changes are planned in connection with the introduction of compulsory 'override'. For the other questions, Walter or Andrei can probably give you a much better answer than me. Bye, bearophile
Jan 05 2012
prev sibling parent reply Manu <turkeyman gmail.com> writes:
 The thing I'm most worried about is people forgetting to declare 'final:'
 on a
 class, or junior programmers who DON'T declare final, perhaps because
 they don't
 understand it, or perhaps because they have 1-2 true-virtuals, and the
 rest are
 just defined in the same place... This is DANGEROUS.
It isn't dangerous, it is just less optimal. What is dangerous is (in C++) the ability to override a non-virtual function, and the use of non-virtual destructors.
In 15 years I have never once overridden a non-virtual function, assuming it was virtual, and wondering why it didn't work... have you? I've never even heard a story of a colleague, or even on the net of that ever happening (yes, I'm sure if I google specifically for it, I could find it, but it's never appeared in an article or such)... but I can point you at almost daily examples of junior programmers making silly mistakes that go un-noticed by their seniors. Especially common are mistakes in declaration where declaration attributes don't change whether the program builds and works or not. It seems to me the decision is that of sacrificing a real and common problem case with frequent and tangible evidence, for the feeling that the language is defined to do the 'right' thing?
 It's also true that D's design makes it possible for a compiler to make
 direct calls if it is doing whole-program analysis and determines that
 there are no overrides of it.
This is only possible with whole program optimisation, and some very crafty code that may or may not ever be implemented, and certainly isn't dependable from compiler vendor 'x'.. There would simply be no problem in the first place if the default was declared the other way around, and the compiler would need none of that extra code, and there are no problems of compiler maturity. Surely this sort of consideration is even more important for an open source project with a relatively small team like D than it is even for C++?
Jan 05 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 1:13 AM, Manu wrote:
 In 15 years I have never once overridden a non-virtual function, assuming it
was
 virtual, and wondering why it didn't work... have you?
Yes, I have. With a complex inheritance hierarchy, I was doing the optimization thing and removing 'virtual' from members that didn't get overridden. Then, some long time later, I'd add an override and forget to put the 'virtual' back on the original. Very strange behavior of my program would result. Even worse, frankly I defy anyone to look at a complex C++ inheritance hierarchy and say with certainty that you've verified that there are no overrides of non-virtual functions in it.
 I've never even heard a story of a colleague, or even on the net of that ever
 happening (yes, I'm sure if I google specifically for it, I could find it, but
 it's never appeared in an article or such)... but I can point you at almost
 daily examples of junior programmers making silly mistakes that go un-noticed
by
 their seniors. Especially common are mistakes in declaration where declaration
 attributes don't change whether the program builds and works or not.
That is the case with overriding a non-virtual function - the compiler will compile it anyway, and most of the time it will work. That's what makes it so eeevil.
 It seems to me the decision is that of sacrificing a real and common problem
 case with frequent and tangible evidence, for the feeling that the language is
 defined to do the 'right' thing?
The right thing should be the default.
     It's also true that D's design makes it possible for a compiler to make
     direct calls if it is doing whole-program analysis and determines that
there
     are no overrides of it.


 This is only possible with whole program optimisation, and some very crafty
code
 that may or may not ever be implemented, and certainly isn't dependable from
 compiler vendor 'x'.. There would simply be no problem in the first place if
the
 default was declared the other way around, and the compiler would need none of
 that extra code, and there are no problems of compiler maturity.
 Surely this sort of consideration is even more important for an open source
 project with a relatively small team like D than it is even for C++?
I feel the correct decision was made. But regardless, there's no way to reverse that decision, as it will break most every D program in existence, and be a HUGE annoyance to everyone who has D code.
Jan 05 2012
next sibling parent reply Manu <turkeyman gmail.com> writes:
 That is the case with overriding a non-virtual function - the compiler
 will compile it anyway, and most of the time it will work. That's what
 makes it so eeevil.
I saw today, or last night, someone suggesting a keyword to make non-virtual override explicit, and error otherwise. Which actually sounded like a really good idea to me, and also addresses this problem. I think a combination of not-virtual-by-default, and an explicit non-virtual override keyword would cover your concern, and also minimise the use of virtual functions. Sounds perfect to me ;) Overriding a non-virtual is actually very rare, and probably often unintended... I really like the idea of a keyword to make this rare use explicit.
 It seems to me the decision is that of sacrificing a real and common
 problem
 case with frequent and tangible evidence, for the feeling that the
 language is
 defined to do the 'right' thing?
The right thing should be the default.
But I fundamentally disagree your choice is 'right'.. This is obviously subjective, so I don't think that's a fair assertion. The problem was obviously not completely defined, and not addressed entirely.. I think the proposal above sounds like a better solution all round, it addresses everyones concerns, and adds a nice little safety bonus for rare non-virtual overriding ;) But as I've previously said, I understand this can't change now, I've let it go :P
Jan 05 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 1:03 PM, Manu wrote:
     That is the case with overriding a non-virtual function - the compiler will
     compile it anyway, and most of the time it will work. That's what makes it
     so eeevil.


 I saw today, or last night, someone suggesting a keyword to make non-virtual
 override explicit, and error otherwise. Which actually sounded like a really
 good idea to me, and also addresses this problem.
That's correct, it does address it. But not for C++.
     The right thing should be the default.
 But I fundamentally disagree your choice is 'right'..
Sure.
 This is obviously subjective, so I don't think that's a fair assertion.
By 'right', I don't necessarily mean 'the most efficient'. I mean that the code should be correct. It's ok if extra work is involved in creating the most efficient version. For example:

    int a;

automatically initializes a to zero. This is correct. If you want it to remain uninitialized,

    int a = void;

which will be faster in the cases where the compiler cannot optimize away a redundant initialization of a. But, it is dangerous because the compiler cannot always prove that a is initialized before use, hence it is not the default.
 But as I've previously said, I understand this can't change now, I've let it
go :P
I understand, I'm just explaining my point of view, and you're just explaining yours.
Jan 05 2012
parent Manu <turkeyman gmail.com> writes:
    The right thing should be the default.
 But I fundamentally disagree your choice is 'right'..
Sure. This is obviously subjective, so I don't think that's a fair assertion.

 By 'right', I don't necessarily mean 'the most efficient'. I mean that the
 code should be correct. It's ok if extra work is involved in creating the
 most efficient version.
But this solution is equally correct, and doesn't make any sacrifice for the most efficient version:

 * methods are not virtual by default.
 * overriding any common method is an error (great, now I know if I've made any sort of mistake).
 * a method declared virtual may be overridden as expected, and virtual-ness is safely confirmed by the lack of a compile error.
 * to override a regular method (a rare thing to do, but still your primary safety concern), you use an explicit keyword to do it. Now it's absolutely intentional.

This provides all the same safety guarantees, ie, your 'right'-ness, and doesn't sacrifice: performance/false-virtual risk, 'final' keyword spam, risk of forgetfulness and the junior coder factor... surely this is MORE 'right', by any measure? :)
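(Purely for illustration, a hypothetical sketch of those rules; this placement of 'virtual' and the 'hides' keyword are invented for the proposal and are not current D syntax:)

class Base
{
    void update() { }          // non-virtual by default under the proposal
    virtual void draw() { }    // dynamic dispatch only where explicitly requested
}

class Derived : Base
{
    override void draw() { }   // fine: Base.draw is declared virtual
    // void update() { }       // would be a compile error: silently hides a non-virtual
    // hides void update() { } // hypothetical explicit keyword for the rare case
}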
Jan 05 2012
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Jan 5, 2012, at 1:03 PM, Manu wrote:

 That is the case with overriding a non-virtual function - the compiler
 will compile it anyway, and most of the time it will work. That's what
 makes it so eeevil.

 I saw today, or last night, someone suggesting a keyword to make
 non-virtual override explicit, and error otherwise. Which actually
 sounded like a really good idea to me, and also addresses this problem.

I think the override keyword fits here, though in reverse.
Jan 05 2012
parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 5 January 2012 at 21:05:07 UTC, Sean Kelly wrote:
 On Jan 5, 2012, at 1:03 PM, Manu wrote:

 That is the case with overriding a non-virtual function - the 
 compiler will compile it anyway, and most of the time it will 
 work. That's what makes it so eeevil.
 
 I saw today, or last night, someone suggesting a keyword to 
 make non-virtual override explicit, and error otherwise. Which 
 actually sounded like a really good idea to me, and also 
 addresses this problem.
I think the override keyword fits here, though in reverse.
Jan 05 2012
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Jan 4, 2012, at 3:26 PM, Manu wrote:

 If a library makes liberal (and completely unnecessary) virtual calls
 to the point where it performs too poorly on some architecture; let's say
 ARM, or PPC (architectures that will suffer far more than x86 from
 virtual calls), I can no longer use this library in my project... What a
 stupid position to be in. The main strength of any language is its
 wealth of libraries available, and a bad language decision prohibiting
 use of libraries for absolutely no practical reason is just broken by my
 measure.

If a library is written without consideration to what is virtual and
what is not, its performance will be the least of your problems. Either
way, this ship has long since sailed. The impact of reversing this
setting would be enormous.
Jan 04 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/4/2012 4:30 PM, Sean Kelly wrote:
 If a library is written without consideration to what is virtual and what is
 not, its performance will be the least of your problems.
I agree. Such is a massive failure in designing a polymorphic type, and the language can't help with that.
Jan 04 2012
parent reply Manu <turkeyman gmail.com> writes:
On 5 January 2012 03:06, Walter Bright <newshound2 digitalmars.com> wrote:

 On 1/4/2012 4:30 PM, Sean Kelly wrote:

 If a library is written without consideration to what is virtual and what is
 not, its performance will be the least of your problems.
I agree. Such is a massive failure in designing a polymorphic type, and the language can't help with that.
I don't follow.. how is someone failing (or forgetting) to type 'final' a "massive design failure"? It's not a design failure, it's not even 'wrong'... it's INEVITABLE. And the language CAN help with that, by making expensive operations require explicit declaration. At least make a compiler flag so I can disable virtual-by-default for my project...?
Jan 05 2012
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 1:16 AM, Manu wrote:
 On 5 January 2012 03:06, Walter Bright <newshound2 digitalmars.com
 <mailto:newshound2 digitalmars.com>> wrote:

     On 1/4/2012 4:30 PM, Sean Kelly wrote:

         If a library is written without consideration to what is virtual and what is
         not, its performance will be the least of your problems.


     I agree. Such is a massive failure in designing a polymorphic type, and the
     language can't help with that.


 I don't follow.. how is someone failing (or forgetting) to type 'final' a
 "massive design failure"? It's not a design failure, it's not even 'wrong'...
 it's INEVITABLE.
 And the language CAN help with that, by making expensive operations require
 explicit declaration.
In any class design, one must decide which functions are overrideable and which are not. The language cannot do it for you; certainly not by switching around the default behavior.
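For illustration, in today's D that decision is spelled out per method with
'final'; a minimal sketch (the class is made up):

    class Widget
    {
        // Intended as a customization point, so it is left virtual (the default).
        void draw() { }

        // Not intended to be overridden: sealed with 'final', which also lets
        // the compiler call it directly (and potentially inline it).
        final int id() { return _id; }

        private int _id;
    }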
 At least make a compiler flag so I can disable virtual-by-default for my
 project...?

I'm afraid that such a switch would have disastrous results, because it
fundamentally alters the meaning of existing code.
Jan 05 2012
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/04/2012 07:53 PM, Manu wrote:
 Oh, and virtual-by-default... completely unacceptable for a systems
 language. most functions are NOT virtual, and finding the false-virtuals
 while optimising will be extremely tedious and time consuming. Worse, if
 libraries contain false virtuals, there's good chance I may not be able
 to use said library on certain architectures (PPC, ARM in particular).
 Terrible decision... completely contrary to modern hardware design and
 trends. Why invent a 'new' language for 10 year old hardware?
If you don't need virtual functions don't use classes.
Jan 04 2012
parent Manu <turkeyman gmail.com> writes:
On 5 January 2012 01:17, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 01/04/2012 07:53 PM, Manu wrote:

 Oh, and virtual-by-default... completely unacceptable for a systems
 language. most functions are NOT virtual, and finding the false-virtuals
 while optimising will be extremely tedious and time consuming. Worse, if
 libraries contain false virtuals, there's good chance I may not be able
 to use said library on certain architectures (PPC, ARM in particular).
 Terrible decision... completely contrary to modern hardware design and
 trends. Why invent a 'new' language for 10 year old hardware?
If you don't need virtual functions don't use classes.
Polymorphism isn't the only difference by a long shot. Allocation and referencing patterns are totally different. I don't feel this is a reasonable counter-argument.
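For reference, a minimal sketch of those differences (the types are made up): a
struct is a value type with no vtable, while a class is a reference type that is
normally allocated on the GC heap.

    struct PointS
    {
        float x, y;
        float lengthSq() { return x*x + y*y; }  // never virtual; structs have no vtable
    }

    class PointC
    {
        float x, y;
        float lengthSq() { return x*x + y*y; }  // virtual by default
    }

    void use()
    {
        PointS s;             // a value, typically on the stack; copied on assignment
        auto c = new PointC;  // a reference to a GC heap allocation
        s.lengthSq();
        c.lengthSq();
    }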
Jan 04 2012
prev sibling next sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 4 January 2012 09:39, Manu <turkeyman gmail.com> wrote:
 This conversation has meandered into one very specific branch, but I just
 want to add my 2c to the OP.
 I agree, I want D to be a useful systems language too. These are my issues
 to that end:

  * __forceinline ... I wasn't aware this didn't exist... and yes, despite
 all this discussion, I still depend on this all the time. People are talking
 about implementing forceinline by imitating macros using mixins... crazy?
 Here's a solid reason I avoid mixins or procedurally generated code (and the
 preprocessor in C for that matter, in favour of __forceinline): YOU CAN
 DEBUG IT. In an inline function, the code exists in the source file, just
 like any other function, you can STEP THE DEBUGGER through it, and inspect
 the values easily. This is an underrated requirement. I would waste hours on
 many days if I couldn't do this. I would only ever use string mixins for the
 most obscure uses, preferring inline functions for the sake of debugging 99%
 of the time.

  * vector type ... D has exactly no way to tell the compiler to allocate
 128bit vector registers, load/store them, and pass them to/from functions.
 That is MOST of the register memory on virtually every modern processor, and
 D can't address it... wtf?

  * inline assembler needs pseudo registers ... The inline assembler is
 pretty crap, imitating C which is out-dated. Registers in assembly code
 should barely ever be addressed directly, they should only be addressed by
 TYPE, allowing the compiler to allocate available registers (and/or manage
 storing to the stack where required) as with any other code. Inline
 assembly without pseudo-registers is almost always an un-optimisation, and
 this is also the reason why almost all C programmers use hardware opcode
 intrinsics instead of inline assembly. There is no way without using
 intrinsics in C to allow the compiler to perform optimal register
 allocation, and this is still true for D, and in my opinion, just plain
 broken.

  * __restrict ... I've said this before, but not being able to hint to the
 compiler that it may ignore possible pointer aliasing is a big performance
 problem, especially when interacting with C libs.

  * multiple return values (in registers) ... (just because I opened a topic
 about it before) This saves memory accesses in common cases where I want to
 return (x, y), or (retVal, errorCode) for instance.

 Walter made an argument: "The same goes for all those language extensions you
 mentioned. Those are not part of Standard C. They are vendor extensions.
 Does that mean that C is not actually a systems language? No."
 This is absurd... are you saying that you expect Iain to add these things to
 GDC so that people can use them, and then create incompatible D code with
 the 'standard' compiler?
 Why would you intentionally fragment the compiler support of language
 features rather than just making trivial (but important) features that
 people do use part of the language?

Code that gdc emits is incompatible with the standard D compiler, if
that's what you want to call it, and any vendor extensions won't
contribute to that being more of the case.

Regardless, there is little reason to want to use a forced inline with
gdc. Just like in C++ when you define all methods in the class
definition, gdc considers all methods as candidates for inlining.
Similarly, when -inline is passed, the same is also done for normal
functions that are considered inlinable by the frontend. These functions
marked as inline are treated in the same way by the backend as a function
declared 'inline' in C or C++.

--
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
Jan 04 2012
prev sibling next sibling parent Andrew Wiley <wiley.andrew.j gmail.com> writes:
On Wed, Jan 4, 2012 at 12:53 PM, Manu <turkeyman gmail.com> wrote:
 Oh, and virtual-by-default... completely unacceptable for a systems
 language. most functions are NOT virtual, and finding the false-virtuals
 while optimising will be extremely tedious and time consuming. Worse, if
 libraries contain false virtuals, there's good chance I may not be able to
 use said library on certain architectures (PPC, ARM in particular). Terrible
 decision... completely contrary to modern hardware design and trends. Why
 invent a 'new' language for 10 year old hardware?
The only benchmark of virtual functions on ARM that I can find is http://mikeash.com/pyblog/performance-comparisons-of-common-operations-iphone-edition.html , which found that the calls, when compared with other operations, performed similarly to x86. I'm not really sure what architecture-specific issues you're referring to here.
Jan 05 2012
prev sibling next sibling parent Artur Skawina <art.08.09 gmail.com> writes:
On 01/05/12 02:34, Iain Buclaw wrote:
 Code that gdc emits is incompatible with the standard D compiler, if
 that's what you want to call it, and any vendor extensions won't
 contribute to that being more of the case.
 
 Regardless, there is little reason to want to use a forced inline with
 gdc.  Just like in c++ when you define all methods in the class
 definition, gdc considers all methods as candidates for inlining.
 Similarly, when -inline is passed, the same is also done for normal
 functions that are considered inlinable by the frontend.  These
 functions marked as inline are treated in the same way as a function
 declared 'inline' in C or C++, and will be treated as such by the
  backend.

"C" inline is, for historical reasons, ill-defined; I think what people are
talking about in the context of D is the equivalent of gcc
attribute(always_inline). I.e. it's for the cases where not inlining is not an
option.

Having an explicit C-style "inline" hint is pointless - the compiler should be
able to guess this right most of the time. It's for the cases where the
programmer already knows the answer and is not willing to let the tool make a
mistake.

artur
Jan 05 2012
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
On 5 January 2012 03:34, Iain Buclaw <ibuclaw ubuntu.com> wrote:

 Regardless, there is little reason to want to use a forced inline with
 gdc.  Just like in c++ when you define all methods in the class
 definition, gdc considers all methods as candidates for inlining.
 Similarly, when -inline is passed, the same is also done for normal
 functions that are considered inlinable by the frontend.  These
 functions marked as inline are treated in the same way as a function
 declared 'inline' in C or C++, and will be treated as such by the
 backend.
How is this possible, when all functions are virtual, without whole program optimisation?
Jan 05 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 3:58 AM, Manu wrote:
 How is this possible, when all functions are virtual, without whole program
 optimisation?
Only public non-final class member functions are virtual. Furthermore, as in C++, the compiler can sometimes determine that a virtual function is being called directly, and will do so (or inline it).
Jan 05 2012
parent reply Manu <turkeyman gmail.com> writes:
On 5 January 2012 23:34, Walter Bright <newshound2 digitalmars.com> wrote:

 On 1/5/2012 3:58 AM, Manu wrote:

 How is this possible, when all functions are virtual, without whole program
 optimisation?
Only public non-final class member functions are virtual. Furthermore, as in C++, the compiler can sometimes determine that a virtual function is being called directly, and will do so (or inline it).
Can you define 'sometimes'? I have trouble believing that this will occur very
often; the stars aligning to that level of precision seems totally unreliable.
Consider the UI code snippet I posted earlier (I've lost the link): most
functions are public and not virtual (mostly accessors, or fairly simple
mutators), and they could only be identified as not-overridden with whole
program optimisation...

The fact you mention the potential for inlining actually heightens my criticism
with another detail I hadn't considered ;) ... Now all my trivial methods won't
only be virtual-called, they won't be inlined either!

I'm genuinely scared of people forgetting to type final (including myself)...
And it's hard as an external coder to go and clean up too. Adding final blocks
to someone else's existing code, you don't necessarily know what is truly
virtual or not... *mumble mumble*
Jan 05 2012
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 1:58 PM, Manu wrote:
 On 5 January 2012 23:34, Walter Bright <newshound2 digitalmars.com
 <mailto:newshound2 digitalmars.com>> wrote:

     On 1/5/2012 3:58 AM, Manu wrote:

          How is this possible, when all functions are virtual, without whole program
          optimisation?


     Only public non-final class member functions are virtual. Furthermore, as
in
     C++, the compiler can sometimes determine that a virtual function is being
     called directly, and will do so (or inline it).


 Can you define 'sometimes'?
In C++, it does it if it's a.foo() rather than pa->foo(). In D, it can be done
if flow analysis proves that the object foo() is being called on really is an
a, and not something derived from a. Also, if you qualify the member call with
the class name, it gets called directly, as in a.C.foo().
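A minimal D sketch of those cases (the classes are made up; the qualified call
is the form described in the previous paragraph):

    class Base
    {
        void foo() { }        // virtual by default
        final void bar() { }  // final: always called directly
    }

    class Child : Base
    {
        override void foo() { }
    }

    void test()
    {
        auto c = new Child;
        c.foo();        // a virtual call in general; direct only where the
                        // compiler can prove the exact type at this call site
        c.bar();        // direct call: bar is final
        c.Base.foo();   // qualified with the class name, as in a.C.foo() above:
                        // calls Base.foo directly, bypassing virtual dispatch
    }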
Jan 05 2012
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
About your junior comment, are virtuals really the biggest thing you
should worry about? There are infinitely many things a newbie
programmer will screw up (think linear algorithms, excessive memory
allocation, hardcoded, non-modular and thread-unsafe code, etc). I
think virtual calls are likely to be just *one* of your problems, and
probably not the biggest one.
Jan 05 2012
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Btw, I think people are having a misconception on what it means that D
is a systems-programming language. It doesn't mean that D by default
generates the fastest code and trades safety for performance; it means
it *allows* you to write such code. But you need to be aware of what
you're coding.
Jan 05 2012
prev sibling parent reply Manu <turkeyman gmail.com> writes:
On 5 January 2012 15:44, Andrej Mitrovic <andrej.mitrovich gmail.com> wrote:

 About your junior comment, are virtuals really the biggest thing you
 should worry about?
Sure it's not the biggest thing, it's one of numerous things. You'll notice I
listed a whole bunch of things in my post, and this isn't my thread; these were
in addition to the OP's comments. I'm just trying to add some weight to the
OP's sentiments, in that I feel the same way in many areas after a few weeks of
experience with D and writing some programs, and considering it for use in
future projects.
 There are infinitely many things a newbie
 programmer will screw up (think linear algorithms, excessive memory
 allocation, hardcoded, non-modular and thread-unsafe code, etc). I
 think virtual calls are likely to be just *one* of your problems, and
 probably not the biggest one.
The point is that this is one thing that is completely silently hidden, and the language could fix this tremendously easily by nothing more than a trivial decision of what is default. I realise that's unlikely to happen, this decision is done now, but I think it's important to raise this sort of issue anyway, so that future decisions have more points in the balance. It would also be generally nice if these concerns were acknowledged rather than brushed off. I'm not making problems for the sake of conversation. These are real issues that I encounter in my daily work.
Jan 05 2012
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 6:06 AM, Manu wrote:
 It would also be generally nice if these concerns were acknowledged rather than
 brushed off. I'm not making problems for the sake of conversation. These are
 real issues that I encounter in my daily work.
I do appreciate your effort in making these issues known to us.
Jan 05 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, December 29, 2011 12:23:47 maarten van damme wrote:
 I think it would be an object oriented language, I'm a believer in the
 string theory :)
Well, if you want to discuss string theory...

http://xkcd.com/171/
http://xkcd.com/397/

:)

- Jonathan M Davis
Jan 02 2012
prev sibling parent "Mattbeui" <matheus_nab hotmail.com> writes:
On Thursday, 29 December 2011 at 09:16:23 UTC, Walter Bright 
wrote:
 http://pastebin.com/AtuzJqh0
I thought this topic was about a mix of Go (Google Language) and D.
Jan 05 2012