
digitalmars.D - The God Language

reply Walter Bright <newshound2 digitalmars.com> writes:
http://pastebin.com/AtuzJqh0
Dec 29 2011
next sibling parent reply Max Samukha <maxsamukha gmail.com> writes:
On 12/29/2011 11:16 AM, Walter Bright wrote:
 http://pastebin.com/AtuzJqh0
He will soon realize that he wants an earthborn language rather than the one of God :)
Dec 29 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/29/2011 1:32 AM, Max Samukha wrote:
 On 12/29/2011 11:16 AM, Walter Bright wrote:
 http://pastebin.com/AtuzJqh0
He will soon realize that he wants an earthborn language rather than the one of God :)
Watch out, or you may attract a thunderbolt!!
Dec 29 2011
prev sibling next sibling parent reply Caligo <iteronvexor gmail.com> writes:
On Thu, Dec 29, 2011 at 3:16 AM, Walter Bright
<newshound2 digitalmars.com>wrote:

 http://pastebin.com/AtuzJqh0
This is somewhat of a serious question: If there is a God (I'm not saying there isn't, and I'm not saying there is), what language would he choose to create the universe? It would be hard for us mortals to imagine, but would it resemble a functional programming language more or something else? And what type of hardware would the code run on? I mean, there are computations happening all around us, e.g., when an apple falls or planets circle the sun, etc, so what's performing all the computation?
Dec 29 2011
next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 29 December 2011 at 10:16:03 UTC, Caligo wrote:
 On Thu, Dec 29, 2011 at 3:16 AM, Walter Bright
 <newshound2 digitalmars.com>wrote:

 http://pastebin.com/AtuzJqh0
This is somewhat of a serious question: If there is a God (I'm not saying there isn't, and I'm not saying there is), what language would he choose to create the universe? It would be hard for us mortals to imagine, but would it resemble a functional programming language more or something else? And what type of hardware would the code run on? I mean, there are computations happening all around us, e.g., when an apple falls or planets circle the sun, etc, so what's performing all the computation?
Obligatory XKCD: http://xkcd.com/224/
Dec 29 2011
prev sibling next sibling parent reply Gour <gour atmarama.net> writes:
On Thu, 29 Dec 2011 04:15:27 -0600
Caligo <iteronvexor gmail.com> wrote:

 This is somewhat of a serious question:  If there is a God (I'm not
 saying there isn't, and I'm not saying there is),
There is. ;)
 It would be hard for us mortals to imagine, but would it resemble a
 functional programming language more or something else?
Just answer the following question: Are we mortals the result of pure function or just side-effect?

Sincerely,
Gour

--
There are principles to regulate attachment and aversion pertaining to
the senses and their objects. One should not come under the control of
such attachment and aversion, because they are stumbling blocks on the
path of self-realization.

http://atmarama.net | Hlapicina (Croatia) | GPG: 52B5C810
Dec 29 2011
next sibling parent Caligo <iteronvexor gmail.com> writes:
On Thu, Dec 29, 2011 at 4:40 AM, Gour <gour atmarama.net> wrote:

 Just answer the following question: Are we mortals the result of pure
 function or just side-effect?
You are asking about creationism and evolution, aren't you? I have to say that I don't know. Always trust the one who is looking for the truth, not the one who has found it. :-)
Dec 29 2011
prev sibling parent reply maarten van damme <maartenvd1994 gmail.com> writes:
I think it would be an object oriented language; I'm a believer in
string theory :)
I have actually thought of the whole universe as one big simulation, which
would really explain how light waves without a medium (like a math function).

If I were god I would def use object oriented, because it makes for easy
describing of different particles and strings. And I'm pretty sure there is
no garbage collector included in god's language :p
Dec 29 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"maarten van damme" <maartenvd1994 gmail.com> wrote in message 
news:mailman.1985.1325157846.24802.digitalmars-d puremagic.com...
I think it would be an object oriented language, I'm a believer in the
 string theory :)
I heard on the Science Channel that M-theory was becoming favored over string theory. (Not that I would actually know.)
 I have actually thought of the whole universe as one big simulation, would
 really explain how light waves without medium (like a math function).
I came across a book one time that talked about the 'verse basically being one big quantum computer. I didn't actually read through it though, and I can't remember what it was called... :(
 If I were god I would def use object oriented because it makes for easy
 describing of different particles and strings. and I'm pretty sure there 
 is
 no garbage collector included in gods language :p
If I were god, then I'd presumably be omnipotent, and if I were omnipotent, then I'd be able to do it all in something like FuckFuck, or that Shakespearean language, or that lolcat language, without any difficulty. And I could just fix any limitations in the implementation. So that would seem the best option :)
Jan 02 2012
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/02/2012 09:00 PM, Nick Sabalausky wrote:
 "maarten van damme"<maartenvd1994 gmail.com>  wrote in message
 news:mailman.1985.1325157846.24802.digitalmars-d puremagic.com...
 I think it would be an object oriented language, I'm a believer in the
 string theory :)
I heard on the Science Channel that M-theory was becoming favored over string therory. (Not that I would actually know.)
 I have actually thought of the whole universe as one big simulation, would
 really explain how light waves without medium (like a math function).
I came across a book one time that talked about the 'verse basically being one big quantum computer. I didn't actually red through it though, and I can't remember what it was called... :(
 If I were god I would def use object oriented because it makes for easy
 describing of different particles and strings. and I'm pretty sure there
 is
 no garbage collector included in gods language :p
If I were god, then I'd presumably be omnipotent, and if I were omnipotent, then I'd be able to do it all in something like FuckFuck, or that shakesperian language, or that lolcat language without any difficulty. And I could just fix any limitations in the implementation. So that would seem the best option :)
God cannot be omnipotent. If he was, he could invent a task he cannot solve.
Jan 02 2012
next sibling parent Caligo <iteronvexor gmail.com> writes:
On Mon, Jan 2, 2012 at 4:29 PM, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 01/02/2012 09:00 PM, Nick Sabalausky wrote:

 "maarten van damme"<maartenvd1994 gmail.com**>  wrote in message
 news:mailman.1985.1325157846.**24802.digitalmars-d puremagic.**com...

 I think it would be an object oriented language, I'm a believer in the
 string theory :)
I heard on the Science Channel that M-theory was becoming favored over string therory. (Not that I would actually know.) I have actually thought of the whole universe as one big simulation,
 would
 really explain how light waves without medium (like a math function).
I came across a book one time that talked about the 'verse basically being one big quantum computer. I didn't actually red through it though, and I can't remember what it was called... :( If I were god I would def use object oriented because it makes for easy
 describing of different particles and strings. and I'm pretty sure there
 is
 no garbage collector included in gods language :p
If I were god, then I'd presumably be omnipotent, and if I were omnipotent, then I'd be able to do it all in something like FuckFuck, or that shakesperian language, or that lolcat language without any difficulty. And I could just fix any limitations in the implementation. So that would seem the best option :)
God cannot be omnipotent. If he was, he could invent a task he cannot solve.
He has; the human race.
Jan 02 2012
prev sibling parent reply Gour <gour atmarama.net> writes:
On Mon, 02 Jan 2012 23:29:17 +0100
Timon Gehr <timon.gehr gmx.ch> wrote:

 God cannot be omnipotent. If he was, he could invent a task he cannot
 solve.
Wrong. He is not static, but dynamic, so He can invent a task he cannot
solve, but in the next moment he can solve it. ;)

Sincerely,
Gour

--
When your intelligence has passed out of the dense forest
of delusion, you shall become indifferent to all that has
been heard and all that is to be heard.

http://atmarama.net | Hlapicina (Croatia) | GPG: 52B5C810
Jan 02 2012
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/03/2012 08:26 AM, Gour wrote:
 On Mon, 02 Jan 2012 23:29:17 +0100
 Timon Gehr<timon.gehr gmx.ch>  wrote:

 God cannot be omnipotent. If he was, he could invent a task he cannot
 solve.
Wrong. He is not static, but dynamic, so He can invent a task he cannot solve, but in the next moment he can solve it. ;) Sincerely, Gour
I meant he can invent a task he will never be able to solve. ;)
Jan 02 2012
next sibling parent Gour <gour atmarama.net> writes:
On Tue, 03 Jan 2012 08:31:33 +0100
Timon Gehr <timon.gehr gmx.ch> wrote:

 I meant he can invent a task he will never be able to solve. ;)
Nah... those are just side-effects, iow. noise. :-D

Sincerely,
Gour

--
But those who, out of envy, disregard these teachings and do not
follow them are to be considered bereft of all knowledge, befooled,
and ruined in their endeavors for perfection.

http://atmarama.net | Hlapicina (Croatia) | GPG: 52B5C810
Jan 02 2012
prev sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Timon Gehr" <timon.gehr gmx.ch> wrote in message 
news:jduasl$ndh$1 digitalmars.com...
 On 01/03/2012 08:26 AM, Gour wrote:
 On Mon, 02 Jan 2012 23:29:17 +0100
 Timon Gehr<timon.gehr gmx.ch>  wrote:

 God cannot be omnipotent. If he was, he could invent a task he cannot
 solve.
Wrong. He is not static, but dynamic, so He can invent a task he cannot solve, but in the next moment he can solve it. ;) Sincerely, Gour
I meant he can invent a task he will never be able to solve. ;)
I've never felt that argument to be particularly compelling: I see it as merely indicating that an omnipotent being is able to give up their own omnipotence. Which, being omnipotent, they'd of course have to be capable of doing.

Of course, you could then try "Could he create a task he couldn't solve without giving up his own omnipotence?" But I think that amounts to a logical contradiction akin to any other, such as "Could an omnipotent being make a rock that isn't a rock?" And that's a whole other philosophical matter (i.e., do logical contradictions count as something an omnipotent being must be able to do?).
Jan 03 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/3/2012 12:48 AM, Nick Sabalausky wrote:
 "Could an omnipotent being make a
 rock that isn't a rock?"
I don't know, but I'm sure he could make a product that is both a floor wax and a dessert topping.
Jan 03 2012
parent reply "Nick Sabalausky" <a a.a> writes:
"Walter Bright" <newshound2 digitalmars.com> wrote in message 
news:jdvgnr$2uer$1 digitalmars.com...
 On 1/3/2012 12:48 AM, Nick Sabalausky wrote:
 "Could an omnipotent being make a
 rock that isn't a rock?"
I don't know, but I'm sure he could make a product that is both a floor wax and a dessert topping.
I'm having visions of Billy Mays...
Jan 03 2012
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/3/2012 10:25 AM, Nick Sabalausky wrote:
 "Walter Bright"<newshound2 digitalmars.com>  wrote in message
 news:jdvgnr$2uer$1 digitalmars.com...
 On 1/3/2012 12:48 AM, Nick Sabalausky wrote:
 "Could an omnipotent being make a
 rock that isn't a rock?"
I don't know, but I'm sure he could make a product that is both a floor wax and a dessert topping.
I'm having visions of Billy Mays...
Wrong reference! Google "floor wax and dessert topping".
Jan 03 2012
prev sibling next sibling parent J Arrizza <cppgent0 gmail.com> writes:
  and I'm pretty sure there is  no garbage collector included in gods
 language :p
Are you sure? There is good evidence he strongly prefers GCs. Consider almost all insects; consider dung beetles specifically. Consider supernovas, gravity and accretion disks. Consider Disney and the Circle of Life. It's pretty clear he views automated recycling as a general architectural approach.

A large benefit of a GC is that it disassociates responsibility for cleanup from the creator of the object. Now imagine the opposite: after you died, you were responsible for disassembling yourself for use by others to create themselves (think "Soylent Green, The Next Generation"). And if you didn't do it, or you didn't do it properly, the world would eventually overcrowd and explode, leaving a core dump in space. Nice.

Of course, he'd give himself a switch to turn off the GC when he really needed to.

John
Jan 02 2012
prev sibling next sibling parent maarten van damme <maartenvd1994 gmail.com> writes:
2012/1/3 J Arrizza <cppgent0 gmail.com>

 Are you sure? There is good evidence he strongly prefers gc's. Consider
 almost all insects; consider dung beetles specifically. Consider super
 novas, gravity and accretion disks. Consider Disney and the Circle of Life.
 It's pretty clear he views automated recycling as a general architectural
 approach.
A large benefit of a gc is it disassociates responsibility for cleanup from the creator of the object. Now imagine the opposite: after you died, you were responsible for disassembling yourself for use by others to create themselves (think "Soylent Green, The Next Generation"). And if you didn't do it, or you didn't do it properly, the world would eventually overcrowd and explode, leaving a core dump in space. Nice.
 Of course, he'd give himself a switch to turn off the gc when he really
 needed to.
there is no destruction/creation going on, energy is constant at all times in a closed system. That's how I thought about it :) If it's constant anyway he wouldn't have to bother with a gc, would he?
I meant he can invent a task he will never be able to solve. ;)
This seems rather strange, doesn't it? If something is able to do everything, he should be able to invent something he is not able to do. If he invented something he is not able to do, he can't do everything. One could therefore assume it is not possible to be able to do everything :D
Well, if you want to discuss string theory...

http://xkcd.com/171/
http://xkcd.com/397/

:)
great one, I really like the first one. It's really the essence of string theory in a way :)
Jan 03 2012
prev sibling parent J Arrizza <cppgent0 gmail.com> writes:
On Tue, Jan 3, 2012 at 2:36 AM, maarten van damme
<maartenvd1994 gmail.com>wrote:

 there is no destruction/creation going on, energy is constant at all times
 in a closed system. That's how I thought about it :)
 If it's constant anyway he wouldn't have to bother with a gc, would he?
I see. Something like "Matter is neither created nor destroyed...". But similarly, memory is neither created nor destroyed. Unless of course you're talking about a god language that can create hardware at run-time:

// make sure the power supply can handle the extra memory
this.PowerSupply.currentCurrent()++;

// ... don't forget extra bypass capacitance
// and check the wiring just in case.
Capacitor mycap = new Capacitor(0.47uF);
this.PowerSupply.BypassCap.Add(mycap);
assert(this.PowerSupply.PositiveRail.capacity > 2.1A);
assert(this.PowerSupply.NegativeRail.capacity > 2.1A);

// finally! Add the extra storage we need
this.SDRAM.extend(1GB);
I meant he can invent a task he will never be able to solve. ;)
 this seems rather strange doesn't it?
 If something is able to do everything, he should be able to invent
 something he is not able to do. if he invented something he is not able to
 do, he can't do everything.
 One could therefore assume it is not possible to be able to do everything
 :D
Can an omnipotent being bypass logical syllogisms? Don't forget: *ALL* powerful means not just the physical stuff. If so, then your argument doesn't hold... or it does. More precisely, it holds and doesn't hold at the same time, until you open the box and Schrödinger's cat jumps out. Or doesn't.

John
Jan 03 2012
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2011-12-29 11:15, Caligo wrote:
 On Thu, Dec 29, 2011 at 3:16 AM, Walter Bright
 <newshound2 digitalmars.com <mailto:newshound2 digitalmars.com>> wrote:

     http://pastebin.com/AtuzJqh0


 This is somewhat of a serious question:  If there is a God (I'm not
 saying there isn't, and I'm not saying there is), what language would he
 choose to create the universe?  It would be hard for us mortals to
 imagine, but would it resemble a functional programming language more or
 something else?  And what type of hardware would the code run on?  I
 mean, there are computations happening all around us, e.g., when an
 apple falls or planets circle the sun, etc, so what's performing all the
 computation?
Servers in the cloud of course :) -- /Jacob Carlborg
Dec 29 2011
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/29/11 4:15 AM, Caligo wrote:
 On Thu, Dec 29, 2011 at 3:16 AM, Walter Bright
 <newshound2 digitalmars.com <mailto:newshound2 digitalmars.com>> wrote:

     http://pastebin.com/AtuzJqh0


 This is somewhat of a serious question:  If there is a God (I'm not
 saying there isn't, and I'm not saying there is), what language would he
 choose to create the universe?  It would be hard for us mortals to
 imagine, but would it resemble a functional programming language more or
 something else?  And what type of hardware would the code run on?  I
 mean, there are computations happening all around us, e.g., when an
 apple falls or planets circle the sun, etc, so what's performing all the
 computation?
Obligatory: http://xkcd.com/224/ Andrei
Dec 29 2011
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/29/2011 2:15 AM, Caligo wrote:
 If there is a God (I'm not saying there
 isn't, and I'm not saying there is), what language would he choose to create
the
 universe?
Mathematics.
Dec 29 2011
next sibling parent so <so so.so> writes:
On Thu, 29 Dec 2011 20:27:43 +0200, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 12/29/2011 2:15 AM, Caligo wrote:
 If there is a God (I'm not saying there
 isn't, and I'm not saying there is), what language would he choose to  
 create the
 universe?
Mathematics.
In essence and in spirit math is THE answer, but if you mean the implementation we have now, it is verbose nonsense. Yet since we are talking about GOD's language, you have a point! Only an immortal could comprehend math fully.
Dec 29 2011
prev sibling parent reply FeepingCreature <default_357-line yahoo.de> writes:
On 12/29/11 19:27, Walter Bright wrote:
 On 12/29/2011 2:15 AM, Caligo wrote:
 If there is a God (I'm not saying there
 isn't, and I'm not saying there is), what language would he choose to create
the
 universe?
Mathematics.
Fan of Tegmark¹, eh? :)

--
¹ http://en.wikipedia.org/wiki/Mathematical_universe_hypothesis
Dec 29 2011
parent Simen Kjærås <simen.kjaras gmail.com> writes:
On Thu, 29 Dec 2011 21:08:29 +0100, FeepingCreature

<default_357-line yahoo.de> wrote:

 On 12/29/11 19:27, Walter Bright wrote:
 On 12/29/2011 2:15 AM, Caligo wrote:
 If there is a God (I'm not saying there
 isn't, and I'm not saying there is), what language would he choose to
 create the
 universe?
Mathematics.
 Fan of Tegmark¹, eh? :) -- ¹ http://en.wikipedia.org/wiki/Mathematical_universe_hypothesis
I love that one. My favorite is that it indicates the existence of a boolean universe. I like to believe it is currently 'off'.
Jan 02 2012
prev sibling next sibling parent Don <nospam nospam.com> writes:
On 29.12.2011 11:15, Caligo wrote:
 On Thu, Dec 29, 2011 at 3:16 AM, Walter Bright
 <newshound2 digitalmars.com <mailto:newshound2 digitalmars.com>> wrote:

     http://pastebin.com/AtuzJqh0


 This is somewhat of a serious question:  If there is a God (I'm not
 saying there isn't, and I'm not saying there is), what language would he
 choose to create the universe?  It would be hard for us mortals to
 imagine, but would it resemble a functional programming language more or
 something else?  And what type of hardware would the code run on?  I
 mean, there are computations happening all around us, e.g., when an
 apple falls or planets circle the sun, etc, so what's performing all the
 computation?
Declarative. Program begins with void. Let there be <thing>.
Dec 29 2011
prev sibling parent bcs <bcs example.com> writes:
On 12/29/2011 02:15 AM, Caligo wrote:
 This is somewhat of a serious question:  If there is a God (I'm not
 saying there isn't, and I'm not saying there is), what language would he
 choose to create the universe?  It would be hard for us mortals to
 imagine, but would it resemble a functional programming language more or
 something else?  And what type of hardware would the code run on?  I
 mean, there are computations happening all around us, e.g., when an
 apple falls or planets circle the sun, etc, so what's performing all the
 computation?
I have two contradictory answers: Languages: Prolog. Hardware: something that can solve the halting problem (but just for Turing machines).
Jan 04 2012
prev sibling next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 29 December 2011 at 09:16:23 UTC, Walter Bright 
wrote:
 Are you a ridiculous hacker? Inline x86 assembly that the 
 compiler actually understands in 32 AND 64 bit code, hex string 
 literals like x"DE ADB EEF" where spacing doesn't matter, the 
 ability to set data alignment cross-platform with type.alignof 
 = 16, load your shellcode verbatim into a string like so: auto 
 str = import("shellcode.txt");
I would like to talk about this for a bit. Personally, I think D's system programming abilities are only half-way there. Note that I am not talking about use cases in high-level application code, but rather low-level, widely-used framework code, where every bit of performance matters (for example: memory copy routines, string builders, garbage collectors).

In-line assembler as part of the language is certainly neat, and in fact coming from Delphi to C++ I was surprised to learn that C++ implementations adopted different syntax for asm blocks. However, compared to some C++ compilers, it has severe limitations and is D's only trick in this alley.

For one thing, there is no way to force the compiler to inline a function (like __forceinline / __attribute__((always_inline))). This is fine for high-level code (where users are best left with PGO and "the compiler knows best"), but sucks if you need a guarantee that the function must be inlined. The guarantee isn't just about inlining heuristics, but also implementation capabilities. For example, some implementations might not be able to inline functions that use certain language features, and your code's performance could demand that such a short function must be inlined. One example of this is inlining functions containing asm blocks - IIRC DMD does not support this. The compiler should fail the build if it can't inline a function tagged with forceinline, instead of shrugging it off and failing silently, forcing users to check the disassembly every time.

You may have noticed that GCC has some ridiculously complicated assembler facilities. However, they also open the way to the possibilities of writing optimal code - for example, creating custom calling conventions, or inlining assembler functions without restricting the caller's register allocation with a predetermined calling convention. In contrast, DMD is very conservative when it comes to mixing D and assembler. One time I found that putting an asm block in a function turned what were single instructions into blocks of 6 instructions each.

D's lacking in this area makes it impossible to create language features that are on the level of D's compiler built-ins. For example, I have tested three memcpy implementations recently, but none of them could beat DMD's standard array slice copy (despite that in release mode it compiles to a simple memcpy call). Why? Because the overhead of using a custom memcpy routine negated its performance gains. This might have been alleviated with the presence of sane macros, but no such luck. String mixins are not the answer: trying to translate macro-heavy C code to D using string mixins is string escape hell, and we're back to the level of shell scripts.

We've discussed this topic on IRC recently. From what I understood, Andrei thinks improvements in this area are not "impactful" enough, which I find worrisome. Personally, I don't think D qualifies as a true "system programming language" in light of the above. It's more of a compiled language with pointers and assembler.

Before you disagree with any of the above, first (for starters) I'd like to invite you to translate Daniel Vik's C memcpy implementation to D: http://www.danielvik.com/2010/02/fast-memcpy-in-c.html . It doesn't even use inline assembler or compiler intrinsics.
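To make the array-copy comparison concrete, a rough benchmark sketch follows; the buffer size, iteration count, and use of std.datetime.StopWatch (whose home module differs in later compiler releases) are illustrative assumptions, not code or numbers from this post:

import core.stdc.string : memcpy;
import std.datetime : StopWatch;
import std.stdio : writefln;

void main()
{
    auto src = new ubyte[64 * 1024];
    auto dst = new ubyte[64 * 1024];
    enum iterations = 100_000;

    StopWatch sw;

    // Built-in array slice copy, as lowered by the compiler/runtime.
    sw.start();
    foreach (i; 0 .. iterations)
        dst[] = src[];
    sw.stop();
    writefln("slice copy: %s ms", sw.peek().msecs);

    // Explicit call to C's memcpy, for comparison.
    sw.reset();
    sw.start();
    foreach (i; 0 .. iterations)
        memcpy(dst.ptr, src.ptr, src.length);
    sw.stop();
    writefln("memcpy:     %s ms", sw.peek().msecs);
}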
Dec 29 2011
next sibling parent reply Alex Rønne Petersen <xtzgzorex gmail.com> writes:
On 29-12-2011 12:19, Vladimir Panteleev wrote:
 On Thursday, 29 December 2011 at 09:16:23 UTC, Walter Bright wrote:
 Are you a ridiculous hacker? Inline x86 assembly that the compiler
 actually understands in 32 AND 64 bit code, hex string literals like
 x"DE ADB EEF" where spacing doesn't matter, the ability to set data
 alignment cross-platform with type.alignof = 16, load your shellcode
 verbatim into a string like so: auto str = import("shellcode.txt");
I would like to talk about this for a bit. Personally, I think D's system programming abilities are only half-way there. Note that I am not talking about use cases in high-level application code, but rather low-level, widely-used framework code, where every bit of performance matters (for example: memory copy routines, string builders, garbage collectors). In-line assembler as part of the language is certainly neat, and in fact coming from Delphi to C++ I was surprised to learn that C++ implementations adopted different syntax for asm blocks. However, compared to some C++ compilers, it has severe limitations and is D's only trick in this alley. For one thing, there is no way to force the compiler to inline a function (like __forceinline / __attribute((always_inline)) ). This is fine for high-level code (where users are best left with PGO and "the compiler knows best"), but sucks if you need a guarantee that the function must be inlined. The guarantee isn't just about inlining heuristics, but also implementation capabilities. For example, some implementations might not be able to inline functions that use certain language features, and your code's performance could demand that such a short function must be inlined. One example of this is inlining functions containing asm blocks - IIRC DMD does not support this. The compiler should fail the build if it can't inline a function tagged with forceinline, instead of shrugging it off and failing silently, forcing users to check the disassembly every time. You may have noticed that GCC has some ridiculously complicated assembler facilities. However, they also open the way to the possibilities of writing optimal code - for example, creating custom calling conventions, or inlining assembler functions without restricting the caller's register allocation with a predetermined calling convention. In contrast, DMD is very conservative when it comes to mixing D and assembler. One time I found that putting an asm block in a function turned what were single instructions into blocks of 6 instructions each. D's lacking in this area makes it impossible to create language features that are on the level of D's compiler built-ins. For example, I have tested three memcpy implementations recently, but none of them could beat DMD's standard array slice copy (despite that in release mode it compiles to a simple memcpy call). Why? Because the overhead of using a custom memcpy routine negated its performance gains. This might have been alleviated with the presence of sane macros, but no such luck. String mixins are not the answer: trying to translate macro-heavy C code to D using string mixins is string escape hell, and we're back to the level of shell scripts. We've discussed this topic on IRC recently. From what I understood, Andrei thinks improvements in this area are not "impactful" enough, which I find worrisome. Personally, I don't think D qualifies as a true "system programming language" in light of the above. It's more of a compiled language with pointers and assembler. Before you disagree with any of the above, first (for starters) I'd like to invite you to translate Daniel Vik's C memcpy implementation to D: http://www.danielvik.com/2010/02/fast-memcpy-in-c.html . It doesn't even use inline assembler or compiler intrinsics.
+1. D needs a way to force inlining. The compiler can, at best, do heuristics. If D wants to cater to systems programmers -- that is, programmers who *know their shit* -- it needs advanced features like this. Same reason we have __gshared, for example. - Alex
Dec 29 2011
parent reply so <so so.so> writes:
On Thu, 29 Dec 2011 13:44:12 +0200, Alex Rønne Petersen

<xtzgzorex gmail.com> wrote:

 +1. D needs a way to force inlining. The compiler can, at best, do
 heuristics. If D wants to cater to systems programmers -- that is,
 programmers who *know their shit* -- it needs advanced features like
 this. Same reason we have __gshared, for example.

 - Alex
The legitimate "D performs so bad in my example" posts appeared in this = = forum almost always ended up with the conclusion that D's lack a controlled = inline mechanism.
Dec 29 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/29/2011 9:15 AM, so wrote:
 The legitimate "D performs so bad in my example" posts appeared in this forum
 almost always ended up with the conclusion that D's lack a controlled inline
 mechanism.
Standard C doesn't have one either. C vendors often implement vendor-specific extensions for this.
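For what it's worth, D later grew exactly this kind of vendor-neutral hint; the sketch below assumes a compiler recent enough to support pragma(inline, true), which none of the compilers in this thread had, and the function names are made up for illustration:

// With pragma(inline, true), a compiler that honors the pragma is
// expected to report an error instead of silently emitting a call
// when it cannot inline the function (exact behavior varies by
// compiler and version).
pragma(inline, true)
uint rotl8(uint x)
{
    return (x << 8) | (x >> 24);
}

uint scramble(uint x)
{
    // The call below should be expanded in place.
    return rotl8(x) ^ 0x9E3779B9;
}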
Dec 29 2011
prev sibling next sibling parent reply Kapps <Kapps NotValidEmail.com> writes:
Agreed.

There are plenty of real-world, even 'common' examples where the lack of 
being able to force inlining for a function is a problem. The main one 
I've run into is not being able to inline functions with assembly, thus 
not being able to implement efficient SIMD operations.
Dec 29 2011
parent reply a <a a.com> writes:
Kapps Wrote:

 Agreed.
 
 There are plenty of real-world, even 'common' examples where the lack of 
 being able to force inlining for a function is a problem. The main one 
 I've run into is not being able to inline functions with assembly, thus 
 not being able to implement efficient SIMD operations.
The problem is not just inlining but also needless loads and stores at the beginnings and ends of asm blocks. For example, the following code:

void test(ref V a, ref V b)
{
    asm
    {
        movaps XMM0, a;
        addps  XMM0, b;
        movaps a, XMM0;
    }
    asm
    {
        movaps XMM0, a;
        addps  XMM0, b;
        movaps a, XMM0;
    }
}

compiles to:

   0:   55                      push   %rbp
   1:   48 8b ec                mov    %rsp,%rbp
   4:   48 83 ec 10             sub    $0x10,%rsp
   8:   48 89 7d f0             mov    %rdi,-0x10(%rbp)
   c:   48 89 75 f8             mov    %rsi,-0x8(%rbp)
  10:   0f 28 45 f8             movaps -0x8(%rbp),%xmm0
  14:   0f 58 45 f0             addps  -0x10(%rbp),%xmm0
  18:   0f 29 45 f8             movaps %xmm0,-0x8(%rbp)
  1c:   0f 28 45 f8             movaps -0x8(%rbp),%xmm0
  20:   0f 58 45 f0             addps  -0x10(%rbp),%xmm0
  24:   0f 29 45 f8             movaps %xmm0,-0x8(%rbp)
  28:   48 8b e5                mov    %rbp,%rsp
  2b:   5d                      pop    %rbp
  2c:   c3                      retq

The needless loads and stores would make it impossible to write an efficient SIMD add function even if the functions containing asm blocks could be inlined.
Dec 29 2011
next sibling parent reply David Nadlinger <see klickverbot.at> writes:
On 12/29/11 2:13 PM, a wrote:
 void test(ref V a, ref V b)
 {
      asm
      {
          movaps XMM0, a;
          addps  XMM0, b;
          movaps a, XMM0;
      }
      asm
      {
          movaps XMM0, a;
          addps  XMM0, b;
          movaps a, XMM0;
      }
 }

 […]

 The needles loads and stores would make it impossible to write an efficient
simd add function even if the functions containing asm blocks could be inlined.
Yes, this is indeed a problem, and as far as I'm aware, usually solved in the gamedev world by using the (SSE) intrinsics your favorite C++ compiler provides, instead of resorting to inline asm. David
Dec 29 2011
next sibling parent Paulo Pinto <pjmlp progtools.org> writes:
Especially because some 64-bit compilers provide intrinsics as the only
way to access the processor.

Visual C++, for example, does not provide inline assembly support.

David Nadlinger Wrote:

 On 12/29/11 2:13 PM, a wrote:
 void test(ref V a, ref V b)
 {
      asm
      {
          movaps XMM0, a;
          addps  XMM0, b;
          movaps a, XMM0;
      }
      asm
      {
          movaps XMM0, a;
          addps  XMM0, b;
          movaps a, XMM0;
      }
 }

 […]

 The needles loads and stores would make it impossible to write an efficient
simd add function even if the functions containing asm blocks could be inlined.
Yes, this is indeed a problem, and as far as I'm aware, usually solved in the gamedev world by using the (SSE) intrinsics your favorite C++ compiler provides, instead of resorting to inline asm. David
Dec 29 2011
prev sibling parent a <a a.com> writes:
David Nadlinger Wrote:

 On 12/29/11 2:13 PM, a wrote:
 void test(ref V a, ref V b)
 {
      asm
      {
          movaps XMM0, a;
          addps  XMM0, b;
          movaps a, XMM0;
      }
      asm
      {
          movaps XMM0, a;
          addps  XMM0, b;
          movaps a, XMM0;
      }
 }

 […]

 The needles loads and stores would make it impossible to write an efficient
simd add function even if the functions containing asm blocks could be inlined.
Yes, this is indeed a problem, and as far as I'm aware, usually solved in the gamedev world by using the (SSE) intrinsics your favorite C++ compiler provides, instead of resorting to inline asm. David
IIRC Walter doesn't want to add vector intrinsics, so it would be nice if the functions to do vector operations could be efficiently written using inline assembly. It would also be a more general solution than having intrinsics. Something like that is possible with gcc extended inline assembly. For example this:

typedef float v4sf __attribute__((vector_size(16)));

void vadd(v4sf *a, v4sf *b)
{
    asm(
        "addps %1, %0"
        : "=x" (*a)
        : "x" (*b), "0" (*a)
        : );
}

void test(float * __restrict__ a, float * __restrict__ b)
{
    v4sf * va = (v4sf*) a;
    v4sf * vb = (v4sf*) b;
    vadd(va,vb);
    vadd(va,vb);
    vadd(va,vb);
    vadd(va,vb);
}

compiles to:

00000000004004c0 <test>:
  4004c0:       0f 28 0e                movaps (%rsi),%xmm1
  4004c3:       0f 28 07                movaps (%rdi),%xmm0
  4004c6:       0f 58 c1                addps  %xmm1,%xmm0
  4004c9:       0f 58 c1                addps  %xmm1,%xmm0
  4004cc:       0f 58 c1                addps  %xmm1,%xmm0
  4004cf:       0f 58 c1                addps  %xmm1,%xmm0
  4004d2:       0f 29 07                movaps %xmm0,(%rdi)

This should also be possible with GDC, but I couldn't figure out how to get something like __restrict__ (if you want to use vector types and gcc extended inline assembly with GDC, see http://www.digitalmars.com/d/archives/D/gnu/Support_for_gcc_vector_attributes_SIM_builtins_3778.html and https://bitbucket.org/goshawk/gdc/wiki/UserDocumentation).
Dec 29 2011
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/29/2011 5:13 AM, a wrote:
 The needles loads and stores would make it impossible to write an efficient
 simd add function even if the functions containing asm blocks could be
 inlined.
This does what you're asking for:

void test(ref float a, ref float b)
{
    asm
    {
        naked;
        movaps XMM0,[RSI];
        addps  XMM0,[RDI];
        movaps [RSI],XMM0;
        movaps XMM0,[RSI];
        addps  XMM0,[RDI];
        movaps [RSI],XMM0;
        ret;
    }
}
Dec 29 2011
parent reply a <a a.com> writes:
Walter Bright Wrote:

 On 12/29/2011 5:13 AM, a wrote:
 The needles loads and stores would make it impossible to write an efficient
 simd add function even if the functions containing asm blocks could be
 inlined.
This does what you're asking for: void test(ref float a, ref float b) { asm { naked; movaps XMM0,[RSI]; addps XMM0,[RDI]; movaps [RSI],XMM0; movaps XMM0,[RSI]; addps XMM0,[RDI]; movaps [RSI],XMM0; ret; } }
What I want is to be able to write short functions using inline assembly and have them inlined and compiled even to a single instruction where possible. This can be done with gcc. See my post here: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=153879
Dec 29 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/29/2011 2:52 PM, a wrote:
 What I want is to be able to write short functions using inline assembly and
 have them inlined and compiled even to a single instruction where possible.
 This can be done with gcc. See my post here:
 http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=153879
I understand. I just wished to make sure you knew about 'naked' and what good it was for.
Dec 29 2011
prev sibling next sibling parent Peter Alexander <peter.alexander.au gmail.com> writes:
On 29/12/11 11:19 AM, Vladimir Panteleev wrote:
 On Thursday, 29 December 2011 at 09:16:23 UTC, Walter Bright wrote:
 Are you a ridiculous hacker? Inline x86 assembly that the compiler
 actually understands in 32 AND 64 bit code, hex string literals like
 x"DE ADB EEF" where spacing doesn't matter, the ability to set data
 alignment cross-platform with type.alignof = 16, load your shellcode
 verbatim into a string like so: auto str = import("shellcode.txt");
I would like to talk about this for a bit. Personally, I think D's system programming abilities are only half-way there. Note that I am not talking about use cases in high-level application code, but rather low-level, widely-used framework code, where every bit of performance matters (for example: memory copy routines, string builders, garbage collectors). In-line assembler as part of the language is certainly neat, and in fact coming from Delphi to C++ I was surprised to learn that C++ implementations adopted different syntax for asm blocks. However, compared to some C++ compilers, it has severe limitations and is D's only trick in this alley. For one thing, there is no way to force the compiler to inline a function (like __forceinline / __attribute((always_inline)) ). This is fine for high-level code (where users are best left with PGO and "the compiler knows best"), but sucks if you need a guarantee that the function must be inlined. The guarantee isn't just about inlining heuristics, but also implementation capabilities. For example, some implementations might not be able to inline functions that use certain language features, and your code's performance could demand that such a short function must be inlined. One example of this is inlining functions containing asm blocks - IIRC DMD does not support this. The compiler should fail the build if it can't inline a function tagged with forceinline, instead of shrugging it off and failing silently, forcing users to check the disassembly every time. You may have noticed that GCC has some ridiculously complicated assembler facilities. However, they also open the way to the possibilities of writing optimal code - for example, creating custom calling conventions, or inlining assembler functions without restricting the caller's register allocation with a predetermined calling convention. In contrast, DMD is very conservative when it comes to mixing D and assembler. One time I found that putting an asm block in a function turned what were single instructions into blocks of 6 instructions each. D's lacking in this area makes it impossible to create language features that are on the level of D's compiler built-ins. For example, I have tested three memcpy implementations recently, but none of them could beat DMD's standard array slice copy (despite that in release mode it compiles to a simple memcpy call). Why? Because the overhead of using a custom memcpy routine negated its performance gains. This might have been alleviated with the presence of sane macros, but no such luck. String mixins are not the answer: trying to translate macro-heavy C code to D using string mixins is string escape hell, and we're back to the level of shell scripts. We've discussed this topic on IRC recently. From what I understood, Andrei thinks improvements in this area are not "impactful" enough, which I find worrisome. Personally, I don't think D qualifies as a true "system programming language" in light of the above. It's more of a compiled language with pointers and assembler. Before you disagree with any of the above, first (for starters) I'd like to invite you to translate Daniel Vik's C memcpy implementation to D: http://www.danielvik.com/2010/02/fast-memcpy-in-c.html . It doesn't even use inline assembler or compiler intrinsics.
+1

Also: vector intrinsics.

Also: alignment specifications (not just member variables).

The lack of both these things is currently causing me much pain :-( Manually aligning things gets tiresome after a while.
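As an illustration of the manual-alignment chore being described, a minimal sketch follows; alloc16 is a made-up helper, and for brevity it loses the raw pointer that a real version would keep around for freeing:

import core.stdc.stdlib : malloc;

// Over-allocate, then round the pointer up to the next 16-byte boundary.
float* alloc16(size_t n)
{
    auto raw = cast(size_t) malloc(n * float.sizeof + 15);
    auto aligned = (raw + 15) & ~cast(size_t) 15;
    return cast(float*) aligned;
}

void main()
{
    float* p = alloc16(1024);
    assert((cast(size_t) p & 15) == 0); // the alignment an SSE movaps expects
}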
Dec 29 2011
prev sibling next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Vladimir Panteleev:

 One example of this is inlining functions containing 
 asm blocks - IIRC DMD does not support this. The compiler should 
 fail the build if it can't inline a function tagged with 
  forceinline, instead of shrugging it off and failing silently, 
 forcing users to check the disassembly every time.
Right.
 You may have noticed that GCC has some ridiculously complicated 
 assembler facilities. However, they also open the way to the 
 possibilities of writing optimal code - for example, creating 
 custom calling conventions, or inlining assembler functions 
 without restricting the caller's register allocation with a 
 predetermined calling convention. In contrast, DMD is very 
 conservative when it comes to mixing D and assembler. One time I 
 found that putting an asm block in a function turned what were 
 single instructions into blocks of 6 instructions each.
LDC has a means to inline functions with asm, and asm expressions. DMD should have both too. I have been saying this for two or three years.

Bye,
bearophile
Dec 29 2011
prev sibling next sibling parent reply Don <nospam nospam.com> writes:
On 29.12.2011 12:19, Vladimir Panteleev wrote:
 On Thursday, 29 December 2011 at 09:16:23 UTC, Walter Bright wrote:
 Are you a ridiculous hacker? Inline x86 assembly that the compiler
 actually understands in 32 AND 64 bit code, hex string literals like
 x"DE ADB EEF" where spacing doesn't matter, the ability to set data
 alignment cross-platform with type.alignof = 16, load your shellcode
 verbatim into a string like so: auto str = import("shellcode.txt");
I would like to talk about this for a bit. Personally, I think D's system programming abilities are only half-way there. Note that I am not talking about use cases in high-level application code, but rather low-level, widely-used framework code, where every bit of performance matters (for example: memory copy routines, string builders, garbage collectors). In-line assembler as part of the language is certainly neat, and in fact coming from Delphi to C++ I was surprised to learn that C++ implementations adopted different syntax for asm blocks. However, compared to some C++ compilers, it has severe limitations and is D's only trick in this alley. For one thing, there is no way to force the compiler to inline a function (like __forceinline / __attribute((always_inline)) ).
[snip]
 Personally, I don't think D qualifies as a true "system programming
 language" in light of the above. It's more of a compiled language with
 pointers and assembler.
I don't think the situation is any different with DMC. I think that if D isn't a systems programming language, neither is C or C++ without vendor-specific extensions. But it doesn't really matter -- the main conclusion is still correct: D is missing some features which could improve performance considerably.
 Before you disagree with any of the above, first
 (for starters) I'd like to invite you to translate Daniel Vik's C memcpy
 implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html . It doesn't even
 use inline assembler or compiler intrinsics.
Note that the memcpy described there is _far_ from optimal. Memcpy is all about cache efficiency. DMD translates memcpy to the single instruction "rep movsd" which you'd think would be optimal, but you can actually beat it by a factor of four or more for long lengths.
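For flavor only, here is the first step beyond a byte-at-a-time loop, copying machine words; the gains Don is describing come from alignment handling, prefetching, and non-temporal stores, none of which this toy sketch attempts, and wordCopy is a made-up name:

// Toy copy routine: moves size_t-sized words, then the leftover bytes.
void wordCopy(void* dst, const(void)* src, size_t n)
{
    auto d = cast(size_t*) dst;
    auto s = cast(const(size_t)*) src;
    const words = n / size_t.sizeof;

    foreach (i; 0 .. words)
        d[i] = s[i];

    auto db = cast(ubyte*) (d + words);
    auto sb = cast(const(ubyte)*) (s + words);
    foreach (i; 0 .. n % size_t.sizeof)
        db[i] = sb[i];
}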
Dec 29 2011
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 29 December 2011 at 14:44:45 UTC, Don wrote:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html . It 
 doesn't even
 use inline assembler or compiler intrinsics.
Note that the memcpy described there is _far_ from optimal. Memcpy is all about cache effciency. DMD translates memcpy to the single instruction "rep movsd" which you'd think would be optimal, but you can actually beat it by a factor of four or more for long lengths.
I've never seen DMD emit rep movsd. Does rep movsd even make sense when the memory areas do not have the same alignment? memcpy in snn.lib has a rep movsd instruction, but there's lots of other code (including what looks like Duff's device).
Dec 29 2011
parent Don <nospam nospam.com> writes:
On 29.12.2011 16:07, Vladimir Panteleev wrote:
 On Thursday, 29 December 2011 at 14:44:45 UTC, Don wrote:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html . It doesn't even
 use inline assembler or compiler intrinsics.
Note that the memcpy described there is _far_ from optimal. Memcpy is all about cache effciency. DMD translates memcpy to the single instruction "rep movsd" which you'd think would be optimal, but you can actually beat it by a factor of four or more for long lengths.
I've never seen DMD emit rep movsd. Does rep movsd even make sense when the memory areas do not have the same alignment? memcpy in snn.lib has a rep movsd instruction, but there's lots of other code (including what looks like Duff's device).
It's in the backend in cod2.c, line 3260. But on closer inspection -- you're right! It's in an if(0 && ...) block. So it never does it, even when everything's aligned. There's a _huge_ potential for improvement in that function.
Dec 29 2011
prev sibling next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 29 December 2011 at 14:44:45 UTC, Don wrote:
 I don't think the situation is any different with DMC. I think 
 that if D isn't a systems programming lanugage, neither is C or 
 C++ without vendor-specific extensions.
You're right... I've never extensively used a C/C++ compiler without similar extensions, though. The fact that major vendors come up with their own extensions to implement many of the same features suggests that they might have been better off standardized.
Dec 29 2011
next sibling parent so <so so.so> writes:
On Thu, 29 Dec 2011 17:20:22 +0200, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Thursday, 29 December 2011 at 14:44:45 UTC, Don wrote:
 I don't think the situation is any different with DMC. I think that if  
 D isn't a systems programming lanugage, neither is C or C++ without  
 vendor-specific extensions.
You're right... I've never extensively used a C/C++ compiler without similar extensions, though. The fact that major vendors come up with their own extensions to do many of the same features shows that they might have better been standardized.
Well, I remember at most one or two people supported me when I brought it up, and Walter dismissed it instantly.
Dec 29 2011
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Vladimir Panteleev:

 The fact that major vendors 
 come up with their own extensions to do many of the same features 
 shows that they might have better been standardized.
Right. (This is why I once asked for computed gotos to be in the D standard as an explicitly-not-implemented feature, even though DMD doesn't implement them; LDC/GDC could probably implement them quickly.) On the other hand, D2 already makes standard several of the non-standard features of GNU C.

Bye,
bearophile
Dec 29 2011
prev sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 29 December 2011 at 14:44:45 UTC, Don wrote:
 I don't think the situation is any different with DMC. I think 
 that if D isn't a systems programming lanugage, neither is C or 
 C++ without vendor-specific extensions.
C macros are a crude form of inlining. String mixins do not scale well in the same way as C macros (e.g. in the way they're used in said memcpy implementation).
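To make the comparison concrete, here is a rough sketch of the string-mixin counterpart of a small C macro (loosely modeled on the CP_INCR/INC_VAL macros from the memcpy article); the names are made up for illustration:

// A CTFE-able function that builds a statement as a string, playing
// the role a one-line C macro would play.
string cpIncr(string dst, string src)
{
    return "*" ~ dst ~ "++ = *" ~ src ~ "++;";
}

void copy4(ubyte* d, const(ubyte)* s)
{
    // Every use goes through mixin(), and anything non-trivial quickly
    // turns into quoting and escaping D code inside string literals.
    mixin(cpIncr("d", "s"));
    mixin(cpIncr("d", "s"));
    mixin(cpIncr("d", "s"));
    mixin(cpIncr("d", "s"));
}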
Dec 29 2011
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/29/2011 3:19 AM, Vladimir Panteleev wrote:
 I'd like to invite you to translate Daniel Vik's C memcpy implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html
Challenge accepted.

------------------------

/********************************************************************
** File: memcpy.c
**
** Copyright (C) 1999-2010 Daniel Vik
**
** This software is provided 'as-is', without any express or implied
** warranty. In no event will the authors be held liable for any
** damages arising from the use of this software.
** Permission is granted to anyone to use this software for any
** purpose, including commercial applications, and to alter it and
** redistribute it freely, subject to the following restrictions:
**
** 1. The origin of this software must not be misrepresented; you
**    must not claim that you wrote the original software. If you
**    use this software in a product, an acknowledgment in the
**    product documentation would be appreciated but is not
**    required.
**
** 2. Altered source versions must be plainly marked as such, and
**    must not be misrepresented as being the original software.
**
** 3. This notice may not be removed or altered from any source
**    distribution.
**
**
** Description: Implementation of the standard library function memcpy.
**              This implementation of memcpy() is ANSI-C89 compatible.
**
** The following configuration options can be set:
**
**   LITTLE_ENDIAN - Uses processor with little endian
**                   addressing. Default is big endian.
**
**   PRE_INC_PTRS  - Use pre increment of pointers.
**                   Default is post increment of
**                   pointers.
**
**   INDEXED_COPY  - Copying data using array indexing.
**                   Using this option, disables the
**                   PRE_INC_PTRS option.
**
**   MEMCPY_64BIT  - Compiles memcpy for 64 bit
**                   architectures
**
**
** Best Settings:
**
**   Intel x86: LITTLE_ENDIAN and INDEXED_COPY
**
*******************************************************************/

module memcpy;

/********************************************************************
** Configuration definitions.
*******************************************************************/

version = LITTLE_ENDIAN;
version = INDEXED_COPY;

/********************************************************************
** Includes for size_t definition
*******************************************************************/

/********************************************************************
** Typedefs
*******************************************************************/

alias ubyte  UInt8;
alias ushort UInt16;
alias uint   UInt32;
alias ulong  UInt64;

version (D_LP64)
{
    alias UInt64 UIntN;
    enum TYPE_WIDTH = 8;
}
else
{
    alias UInt32 UIntN;
    enum TYPE_WIDTH = 4;
}

/********************************************************************
** Remove definitions when INDEXED_COPY is defined.
*******************************************************************/

//#if defined (INDEXED_COPY)
//#if defined (PRE_INC_PTRS)
//#undef PRE_INC_PTRS
//#endif /*PRE_INC_PTRS*/
//#endif /*INDEXED_COPY*/

/********************************************************************
** Definitions for pre and post increment of pointers.
*******************************************************************/

version (PRE_INC_PTRS)
{
    void START_VAL(ref UInt8* x) { x--; }
    ref T INC_VAL(T)(ref T* x) { return *++x; }
    UInt8* CAST_TO_U8(void* p, int o) { return cast(UInt8*)p + o + TYPE_WIDTH; }
    enum WHILE_DEST_BREAK = (TYPE_WIDTH - 1);
    enum PRE_LOOP_ADJUST = -(TYPE_WIDTH - 1);
    enum PRE_SWITCH_ADJUST = 1;
}
else
{
    void START_VAL(UInt8* x) { }
    ref T INC_VAL(T)(ref T* x) { return *x++; }
    UInt8* CAST_TO_U8(void* p, int o) { return cast(UInt8*)p + o; }
    enum WHILE_DEST_BREAK = 0;
    enum PRE_LOOP_ADJUST = 0;
    enum PRE_SWITCH_ADJUST = 0;
}

/********************************************************************
**
** void *memcpy(void *dest, const void *src, size_t count)
**
** Args:    dest   - pointer to destination buffer
**          src    - pointer to source buffer
**          count  - number of bytes to copy
**
** Return:  A pointer to destination buffer
**
** Purpose: Copies count bytes from src to dest.
**          No overlap check is performed.
**
*******************************************************************/

void *memcpy(void *dest, const void *src, size_t count)
{
    auto dst8 = cast(UInt8*)dest;
    auto src8 = cast(UInt8*)src;
    UIntN* dstN;
    UIntN* srcN;
    UIntN dstWord;
    UIntN srcWord;

    /****************************************************************
    ** Macros for copying words of different alignment.
    ** Uses incremening pointers.
    ***************************************************************/

    void CP_INCR()
    {
        INC_VAL(dstN) = INC_VAL(srcN);
    }

    void CP_INCR_SH(int shl, int shr)
    {
        version (LITTLE_ENDIAN)
        {
            dstWord = srcWord >> shl;
            srcWord = INC_VAL(srcN);
            dstWord |= srcWord << shr;
            INC_VAL(dstN) = dstWord;
        }
        else
        {
            dstWord = srcWord << shl;
            srcWord = INC_VAL(srcN);
            dstWord |= srcWord >> shr;
            INC_VAL(dstN) = dstWord;
        }
    }

    /****************************************************************
    ** Macros for copying words of different alignment.
    ** Uses array indexes.
    ***************************************************************/

    void CP_INDEX(size_t idx)
    {
        dstN[idx] = srcN[idx];
    }

    void CP_INDEX_SH(size_t x, int shl, int shr)
    {
        version (LITTLE_ENDIAN)
        {
            dstWord = srcWord >> shl;
            srcWord = srcN[x];
            dstWord |= srcWord << shr;
            dstN[x] = dstWord;
        }
        else
        {
            dstWord = srcWord << shl;
            srcWord = srcN[x];
            dstWord |= srcWord >> shr;
            dstN[x] = dstWord;
        }
    }

    /****************************************************************
    ** Macros for copying words of different alignment.
    ** Uses incremening pointers or array indexes depending on
    ** configuration.
    ***************************************************************/

    version (INDEXED_COPY)
    {
        void CP(size_t idx) { CP_INDEX(idx); }
        void CP_SH(size_t idx, int shl, int shr) { CP_INDEX_SH(idx, shl, shr); }
        void INC_INDEX(T)(ref T* p, size_t o) { p += o; }
    }
    else
    {
        void CP(size_t idx) { CP_INCR(); }
        void CP_SH(size_t idx, int shl, int shr) { CP_INCR_SH(shl, shr); }
        void INC_INDEX(T)(T* p, size_t o) { }
    }

    void COPY_REMAINING(size_t count)
    {
        START_VAL(dst8);
        START_VAL(src8);

        switch (count)
        {
        case 7: INC_VAL(dst8) = INC_VAL(src8);
        case 6: INC_VAL(dst8) = INC_VAL(src8);
        case 5: INC_VAL(dst8) = INC_VAL(src8);
        case 4: INC_VAL(dst8) = INC_VAL(src8);
        case 3: INC_VAL(dst8) = INC_VAL(src8);
        case 2: INC_VAL(dst8) = INC_VAL(src8);
        case 1: INC_VAL(dst8) = INC_VAL(src8);
        case 0:
        default: break;
        }
    }

    void COPY_NO_SHIFT()
    {
        dstN = cast(UIntN*)(dst8 + PRE_LOOP_ADJUST);
        srcN = cast(UIntN*)(src8 + PRE_LOOP_ADJUST);
        size_t length = count / TYPE_WIDTH;

        while (length & 7)
        {
            CP_INCR();
            length--;
        }

        length /= 8;

        while (length--)
        {
            CP(0);
            CP(1);
            CP(2);
            CP(3);
            CP(4);
            CP(5);
            CP(6);
            CP(7);
            INC_INDEX(dstN, 8);
            INC_INDEX(srcN, 8);
        }

        src8 = CAST_TO_U8(srcN, 0);
        dst8 = CAST_TO_U8(dstN, 0);

        COPY_REMAINING(count & (TYPE_WIDTH - 1));
    }

    void COPY_SHIFT(int shift)
    {
        dstN = cast(UIntN*)(((cast(UIntN)dst8) + PRE_LOOP_ADJUST) & ~(TYPE_WIDTH - 1));
        srcN = cast(UIntN*)(((cast(UIntN)src8) + PRE_LOOP_ADJUST) & ~(TYPE_WIDTH - 1));
        size_t length = count / TYPE_WIDTH;

        srcWord = INC_VAL(srcN);

        while (length & 7)
        {
            CP_INCR_SH(8 * shift, 8 * (TYPE_WIDTH - shift));
            length--;
        }

        length /= 8;

        while (length--)
        {
            CP_SH(0, 8 * shift, 8 * (TYPE_WIDTH - shift));
            CP_SH(1, 8 * shift, 8 * (TYPE_WIDTH - shift));
            CP_SH(2, 8 * shift, 8 * (TYPE_WIDTH - shift));
            CP_SH(3, 8 * shift, 8 * (TYPE_WIDTH - shift));
            CP_SH(4, 8 * shift, 8 * (TYPE_WIDTH - shift));
            CP_SH(5, 8 * shift, 8 * (TYPE_WIDTH - shift));
            CP_SH(6, 8 * shift, 8 * (TYPE_WIDTH - shift));
            CP_SH(7, 8 * shift, 8 * (TYPE_WIDTH - shift));
            INC_INDEX(dstN, 8);
            INC_INDEX(srcN, 8);
        }

        src8 = CAST_TO_U8(srcN, (shift - TYPE_WIDTH));
        dst8 = CAST_TO_U8(dstN, 0);

        COPY_REMAINING(count & (TYPE_WIDTH - 1));
    }

    if (count < 8)
    {
        COPY_REMAINING(count);
        return dest;
    }

    START_VAL(dst8);
    START_VAL(src8);

    while ((cast(UIntN)dst8 & (TYPE_WIDTH - 1)) != WHILE_DEST_BREAK)
    {
        INC_VAL(dst8) = INC_VAL(src8);
        count--;
    }

    final switch (((cast(UIntN)src8) + PRE_SWITCH_ADJUST) & (TYPE_WIDTH - 1))
    {
    case 0: COPY_NO_SHIFT(); break;
    case 1: COPY_SHIFT(1); break;
    case 2: COPY_SHIFT(2); break;
    case 3: COPY_SHIFT(3); break;
    static if (TYPE_WIDTH >= 4)
    {
    case 4: COPY_SHIFT(4); break;
    case 5: COPY_SHIFT(5); break;
    case 6: COPY_SHIFT(6); break;
    case 7: COPY_SHIFT(7); break;
    }
    }
    return dest;
}
Dec 29 2011
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/29/11 1:47 PM, Walter Bright wrote:
 On 12/29/2011 3:19 AM, Vladimir Panteleev wrote:
 I'd like to invite you to translate Daniel Vik's C memcpy
 implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html
Challenge accepted.
[snip] Benchmarks? Andrei
Dec 29 2011
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/29/2011 11:47 AM, Walter Bright wrote:
 On 12/29/2011 3:19 AM, Vladimir Panteleev wrote:
 I'd like to invite you to translate Daniel Vik's C memcpy implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html
Challenge accepted.
This does compile, though I did not test or benchmark it.

Examining the assembler output, it inlines everything except COPY_SHIFT, COPY_NO_SHIFT, and COPY_REMAINING. The inliner in dmd could definitely be improved, but that is not a problem with the language, but the implementation.

Continuing in that vein, please note that neither C nor C++ require inlining of any sort. The "inline" keyword is merely a hint to the compiler. What inlining takes place is completely implementation defined, not language defined.

The same goes for all those language extensions you mentioned. Those are not part of Standard C. They are vendor extensions. Does that mean that C is not actually a systems language? No.

I wish to note that the D version semantically accomplishes the same thing as the C version without using mixins or CTFE - it's all straightforward code, without the abusive preprocessor tricks.
Dec 29 2011
parent reply so <so so.so> writes:
On Thu, 29 Dec 2011 22:00:12 +0200, Walter Bright  
<newshound2 digitalmars.com> wrote:

 Examining the assembler output, it inlines everything except COPY_SHIFT,  
 COPY_NO_SHIFT, and COPY_REMAINING. The inliner in dmd could definitely  
 be improved, but that is not a problem with the language, but the  
 implementation.

 Continuing in that vein, please note that neither C nor C++ require  
 inlining of any sort. The "inline" keyword is merely a hint to the  
 compiler. What inlining takes place is completely implementation  
 defined, not language defined.

 The same goes for all those language extensions you mentioned. Those are  
 not part of Standard C. They are vendor extensions. Does that mean that  
 C is not actually a systems language? No.

 I wish to note that the D version semantically accomplishes the same  
 thing as the C version without using mixins or CTFE - it's all  
 straightforward code, without the abusive preprocessor tricks.
Yet every big C/C++ compiler has to support it, no? Let's forget D for a second. Will you, as a compiler vendor, support controlled inlining in DMD with an extension? Or let me try another way: will you "let" the community do it?
Dec 29 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/29/2011 12:23 PM, so wrote:
 Yet every big C/C++ compiler has to support it, no?
 Lets forget D for a second.
 Will you, as a compiler vendor support controlled inline in DMD with an
extension?
 Or let me try another way, will you "let" community to do it?
You can do a pull request for it, and we can evaluate it.
Dec 29 2011
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/29/2011 11:47 AM, Walter Bright wrote:
 On 12/29/2011 3:19 AM, Vladimir Panteleev wrote:
 I'd like to invite you to translate Daniel Vik's C memcpy implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html
Challenge accepted.
Here's another version that uses string mixins to ensure inlining of the COPY functions. There are no call instructions in the generated code. This should be as good as the C version using the same code generator. ---------------- /******************************************************************** ** File: memcpy.c ** ** Copyright (C) 1999-2010 Daniel Vik ** ** This software is provided 'as-is', without any express or implied ** warranty. In no event will the authors be held liable for any ** damages arising from the use of this software. ** Permission is granted to anyone to use this software for any ** purpose, including commercial applications, and to alter it and ** redistribute it freely, subject to the following restrictions: ** ** 1. The origin of this software must not be misrepresented; you ** must not claim that you wrote the original software. If you ** use this software in a product, an acknowledgment in the ** use this software in a product, an acknowledgment in the ** product documentation would be appreciated but is not ** required. ** ** 2. Altered source versions must be plainly marked as such, and ** must not be misrepresented as being the original software. ** ** 3. This notice may not be removed or altered from any source ** distribution. ** ** ** Description: Implementation of the standard library function memcpy. ** This implementation of memcpy() is ANSI-C89 compatible. ** ** The following configuration options can be set: ** ** LITTLE_ENDIAN - Uses processor with little endian ** addressing. Default is big endian. ** ** PRE_INC_PTRS - Use pre increment of pointers. ** Default is post increment of ** pointers. ** ** INDEXED_COPY - Copying data using array indexing. ** Using this option, disables the ** PRE_INC_PTRS option. ** ** MEMCPY_64BIT - Compiles memcpy for 64 bit ** architectures ** ** ** Best Settings: ** ** Intel x86: LITTLE_ENDIAN and INDEXED_COPY ** *******************************************************************/ module memcpy; /******************************************************************** ** Configuration definitions. *******************************************************************/ version = LITTLE_ENDIAN; version = INDEXED_COPY; /******************************************************************** ** Includes for size_t definition *******************************************************************/ /******************************************************************** ** Typedefs *******************************************************************/ alias ubyte UInt8; alias ushort UInt16; alias uint UInt32; alias ulong UInt64; version (D_LP64) { alias UInt64 UIntN; enum TYPE_WIDTH = 8; } else { alias UInt32 UIntN; enum TYPE_WIDTH = 4; } /******************************************************************** ** Remove definitions when INDEXED_COPY is defined. *******************************************************************/ //#if defined (INDEXED_COPY) //#if defined (PRE_INC_PTRS) //#undef PRE_INC_PTRS //#endif /*PRE_INC_PTRS*/ //#endif /*INDEXED_COPY*/ /******************************************************************** ** Definitions for pre and post increment of pointers. 
*******************************************************************/ version (PRE_INC_PTRS) { void START_VAL(ref UInt8* x) { x--; } ref T INC_VAL(T)(ref T* x) { return *++x; } UInt8* CAST_TO_U8(void* p, int o) { return cast(UInt8*)p + o + TYPE_WIDTH; } enum WHILE_DEST_BREAK = (TYPE_WIDTH - 1); enum PRE_LOOP_ADJUST = -(TYPE_WIDTH - 1); enum PRE_SWITCH_ADJUST = 1; } else { void START_VAL(UInt8* x) { } ref T INC_VAL(T)(ref T* x) { return *x++; } UInt8* CAST_TO_U8(void* p, int o) { return cast(UInt8*)p + o; } enum WHILE_DEST_BREAK = 0; enum PRE_LOOP_ADJUST = 0; enum PRE_SWITCH_ADJUST = 0; } /******************************************************************** ** ** void *memcpy(void *dest, const void *src, size_t count) ** ** Args: dest - pointer to destination buffer ** src - pointer to source buffer ** count - number of bytes to copy ** ** Return: A pointer to destination buffer ** ** Purpose: Copies count bytes from src to dest. ** No overlap check is performed. ** *******************************************************************/ void *memcpy(void *dest, const void *src, size_t count) { auto dst8 = cast(UInt8*)dest; auto src8 = cast(UInt8*)src; UIntN* dstN; UIntN* srcN; UIntN dstWord; UIntN srcWord; /******************************************************************** ** Macros for copying words of different alignment. ** Uses incremening pointers. *******************************************************************/ void CP_INCR() { INC_VAL(dstN) = INC_VAL(srcN); } void CP_INCR_SH(int shl, int shr) { version (LITTLE_ENDIAN) { dstWord = srcWord >> shl; srcWord = INC_VAL(srcN); dstWord |= srcWord << shr; INC_VAL(dstN) = dstWord; } else { dstWord = srcWord << shl; srcWord = INC_VAL(srcN); dstWord |= srcWord >> shr; INC_VAL(dstN) = dstWord; } } /******************************************************************** ** Macros for copying words of different alignment. ** Uses array indexes. *******************************************************************/ void CP_INDEX(size_t idx) { dstN[idx] = srcN[idx]; } void CP_INDEX_SH(size_t x, int shl, int shr) { version (LITTLE_ENDIAN) { dstWord = srcWord >> shl; srcWord = srcN[x]; dstWord |= srcWord << shr; dstN[x] = dstWord; } else { dstWord = srcWord << shl; srcWord = srcN[x]; dstWord |= srcWord >> shr; dstN[x] = dstWord; } } /******************************************************************** ** Macros for copying words of different alignment. ** Uses incremening pointers or array indexes depending on ** configuration. 
*******************************************************************/ version (INDEXED_COPY) { void CP(size_t idx) { CP_INDEX(idx); } void CP_SH(size_t idx, int shl, int shr) { CP_INDEX_SH(idx, shl, shr); } void INC_INDEX(T)(ref T* p, size_t o) { p += o; } } else { void CP(size_t idx) { CP_INCR(); } void CP_SH(size_t idx, int shl, int shr) { CP_INCR_SH(shl, shr); } void INC_INDEX(T)(T* p, size_t o) { } } static immutable string COPY_REMAINING = q{ START_VAL(dst8); START_VAL(src8); switch (cnt) { case 7: INC_VAL(dst8) = INC_VAL(src8); case 6: INC_VAL(dst8) = INC_VAL(src8); case 5: INC_VAL(dst8) = INC_VAL(src8); case 4: INC_VAL(dst8) = INC_VAL(src8); case 3: INC_VAL(dst8) = INC_VAL(src8); case 2: INC_VAL(dst8) = INC_VAL(src8); case 1: INC_VAL(dst8) = INC_VAL(src8); case 0: default: break; } }; static immutable string COPY_NO_SHIFT = q{ dstN = cast(UIntN*)(dst8 + PRE_LOOP_ADJUST); srcN = cast(UIntN*)(src8 + PRE_LOOP_ADJUST); size_t length = count / TYPE_WIDTH; while (length & 7) { CP_INCR(); length--; } length /= 8; while (length--) { CP(0); CP(1); CP(2); CP(3); CP(4); CP(5); CP(6); CP(7); INC_INDEX(dstN, 8); INC_INDEX(srcN, 8); } src8 = CAST_TO_U8(srcN, 0); dst8 = CAST_TO_U8(dstN, 0); { const cnt = (count & (TYPE_WIDTH - 1)); mixin(COPY_REMAINING); } }; static immutable string COPY_SHIFT = q{ dstN = cast(UIntN*)(((cast(UIntN)dst8) + PRE_LOOP_ADJUST) & ~(TYPE_WIDTH - 1)); srcN = cast(UIntN*)(((cast(UIntN)src8) + PRE_LOOP_ADJUST) & ~(TYPE_WIDTH - 1)); size_t length = count / TYPE_WIDTH; srcWord = INC_VAL(srcN); while (length & 7) { CP_INCR_SH(8 * shift, 8 * (TYPE_WIDTH - shift)); length--; } length /= 8; while (length--) { CP_SH(0, 8 * shift, 8 * (TYPE_WIDTH - shift)); CP_SH(1, 8 * shift, 8 * (TYPE_WIDTH - shift)); CP_SH(2, 8 * shift, 8 * (TYPE_WIDTH - shift)); CP_SH(3, 8 * shift, 8 * (TYPE_WIDTH - shift)); CP_SH(4, 8 * shift, 8 * (TYPE_WIDTH - shift)); CP_SH(5, 8 * shift, 8 * (TYPE_WIDTH - shift)); CP_SH(6, 8 * shift, 8 * (TYPE_WIDTH - shift)); CP_SH(7, 8 * shift, 8 * (TYPE_WIDTH - shift)); INC_INDEX(dstN, 8); INC_INDEX(srcN, 8); } src8 = CAST_TO_U8(srcN, (shift - TYPE_WIDTH)); dst8 = CAST_TO_U8(dstN, 0); { const cnt = (count & (TYPE_WIDTH - 1)); mixin(COPY_REMAINING); } }; if (count < 8) { const cnt = count; mixin(COPY_REMAINING); return dest; } START_VAL(dst8); START_VAL(src8); while ((cast(UIntN)dst8 & (TYPE_WIDTH - 1)) != WHILE_DEST_BREAK) { INC_VAL(dst8) = INC_VAL(src8); count--; } final switch (((cast(UIntN)src8) + PRE_SWITCH_ADJUST) & (TYPE_WIDTH - 1)) { case 0: mixin(COPY_NO_SHIFT); break; case 1: { const shift = 1; mixin(COPY_SHIFT); } break; case 2: { const shift = 2; mixin(COPY_SHIFT); } break; case 3: { const shift = 3; mixin(COPY_SHIFT); } break; static if (TYPE_WIDTH >= 4) { case 4: { const shift = 4; mixin(COPY_SHIFT); } break; case 5: { const shift = 5; mixin(COPY_SHIFT); } break; case 6: { const shift = 6; mixin(COPY_SHIFT); } break; case 7: { const shift = 7; mixin(COPY_SHIFT); } break; } } return dest; }
Dec 29 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/29/11 2:29 PM, Walter Bright wrote:
 On 12/29/2011 11:47 AM, Walter Bright wrote:
 On 12/29/2011 3:19 AM, Vladimir Panteleev wrote:
 I'd like to invite you to translate Daniel Vik's C memcpy
 implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html
Challenge accepted.
Here's another version that uses string mixins to ensure inlining of the COPY functions. There are no call instructions in the generated code. This should be as good as the C version using the same code generator.
[snip] In other news, TAB has died with Kim-Jong Il. Please stop using it. Andrei
Dec 29 2011
parent "Nick Sabalausky" <a a.a> writes:
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:jdilar$k66$1 digitalmars.com...
 On 12/29/11 2:29 PM, Walter Bright wrote:
 On 12/29/2011 11:47 AM, Walter Bright wrote:
 On 12/29/2011 3:19 AM, Vladimir Panteleev wrote:
 I'd like to invite you to translate Daniel Vik's C memcpy
 implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html
Challenge accepted.
Here's another version that uses string mixins to ensure inlining of the COPY functions. There are no call instructions in the generated code. This should be as good as the C version using the same code generator.
[snip] In other news, TAB has died with Kim-Jong Il. Please stop using it.
Tab is indeed evil when certain people insist it should be size 8 ;)
Jan 02 2012
prev sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 29 December 2011 at 19:47:39 UTC, Walter Bright 
wrote:
 On 12/29/2011 3:19 AM, Vladimir Panteleev wrote:
 I'd like to invite you to translate Daniel Vik's C memcpy 
 implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html
Challenge accepted.
Ah, a direct translation using functions! This is probably the most elegant approach, however - as I'm sure you've noticed - the programmer has no control over what gets inlined.
 Examining the assembler output, it inlines everything except 
 COPY_SHIFT, COPY_NO_SHIFT, and COPY_REMAINING. The inliner in 
 dmd could definitely be improved, but that is not a problem 
 with the language, but the implementation.
This is the problem with heuristic inlining: while great by itself, in a position such as this the programmer is left with no choice but to examine the assembler output to make sure the compiler does what the programmer wants it to do. Such behavior can change from one implementation to another, and even from one compiler version to another. (After all, I don't think that we can guarantee that what's inlined today, will be inlined tomorrow.)
 Continuing in that vein, please note that neither C nor C++ 
 require inlining of any sort. The "inline" keyword is merely a 
 hint to the compiler. What inlining takes place is completely 
 implementation defined, not language defined.
I think we can agree that the C inline hint is of limited use. However, major C compiler vendors implement an extension to force inlining. Generally, I would say that common vendor extensions seen in other languages are an opportunity for D to avoid a similar mess: such extensions would not have to be required to be implemented, but when they are, they would use the same syntax across implementations.
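To make the idea concrete, here is a minimal sketch of what a shared spelling could look like. The pragma form below is purely hypothetical (no compiler in this thread implements it), and the function name is made up for illustration; the point is only that an implementation may decline to honor the request, but should not spell it differently:

// Hypothetical: a force-inline request that an implementation may reject
// with an error, but must not silently ignore.
pragma(inline, true)
int addClamped(int a, int b)
{
    long r = cast(long)a + b;               // widen to avoid overflow
    return r > int.max ? int.max : cast(int)r;
}

void main()
{
    assert(addClamped(int.max, 1) == int.max);
}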
 I wish to note that the D version semantically accomplishes the 
 same thing as the C version without using mixins or CTFE - it's 
 all straightforward code, without the abusive preprocessor 
 tricks.
I don't think there's much value in that statement. After all, except for a few occasional templates (which weren't strictly necessary), your translation uses few D-specific features. If you were to leave yourself at the mercy of a C compiler's optimizer, your rewrite would merely be a testament against C macros, not the power of D. However, the most important part is: this translation is incorrect. C macros in the original code provide a guarantee that the code is inlined. D cannot make such guarantees - even your amended version is tuned to one specific implementation (and possibly, only a specific range of versions of it).
Dec 29 2011
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/29/2011 9:51 PM, Vladimir Panteleev wrote:
 Ah, a direct translation using functions! This is probably the most elegant
 approach, however - as I'm sure you've noticed - the programmer has no control
 over what gets inlined.
The programmer also has no control over which variables go into which registers. (Early C compilers did provide this.)
 I think we can agree that the C inline hint is of limited use. However, major C
 compiler vendors implement an extension to force inlining.
I know.
 I don't think there's much value in that statement. After all, except for a few
 occasional templates (which weren't strictly necessary), your translation uses
 few D-specific features. If you were to leave yourself at the mercy of a C
 compiler's optimizer, your rewrite would merely be a testament against C
macros,
 not the power of D.
I think this criticism is off target, because the C example was almost entirely macros - and macros that were used in the service of evading C language limitations. The point wasn't to use clever D features, the challenge was to demonstrate you can get the same results in D as in C.
 However, the most important part is: this translation is incorrect. C macros in
 the original code provide a guarantee that the code is inlined. D cannot make
 such guarantees - even your amended version is tuned to one specific
 implementation (and possibly, only a specific range of versions of it).
I also think this is off target, because a C compiler really doesn't guarantee **** about efficiency, it only guarantees that it will work "as if" it was executed on some idealized abstract machine. Even dividing code up into functions is completely arbitrary, and open to wildly different strategies that are perfectly legal to any C compiler. A C compiler doesn't have to enregister anything in variables, either, and that has far more of a performance impact than inlining.

There is a very wide range of code generation techniques that compilers employ. All of them, to verify that they are being applied, require inspection of the assembler output. Many argue that the compiler should tell you about inlining - but what about all those others? I think the focus on inlining (as opposed to other possible optimizations) is out of proportion, likely exacerbated by dmd needing to do a better job of it.

I completely agree that DMD's inliner is underpowered and needs improvement. I am less sure that this demonstrates that the language needs changes.

Functions below a certain size should be inlined if possible. Those above that size do not benefit perceptibly from inlining. Where that certain size exactly is, who knows, but I doubt that functions near that size will benefit much from user intervention.
Dec 29 2011
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Friday, 30 December 2011 at 06:53:06 UTC, Walter Bright wrote:
 I think this criticism is off target, because the C example was 
 almost entirely macros - and macros that were used in the 
 service of evading C language limitations. The point wasn't to 
 use clever D features, the challenge was to demonstrate you can 
 get the same results in D as in C.
...
 I also think this is off target, because a C compiler really 
 doesn't guarantee **** about efficiency, it only guarantees 
 that it will work "as if" it was executed on some idealized 
 abstract machine. Even dividing code up into functions is 
 completely arbitrary, and open to wildly different strategies 
 that are perfectly legal to any C compiler. A C compiler 
 doesn't have to enregister anything in variables, either, and 
 that has far more of a performance impact than inlining.
Even though the core languages (of C and D) are not specific to any one platform, writing fast code has never been about targeting abstract idealized virtual machines. Some assumptions need to be made. Most assumptions that the C memcpy code makes can be expected to generally be true across major C compilers (e.g. macros are at least as fast as regular functions). However, your D port makes some rather fragile assumptions regarding the compiler implementation.

Let's eliminate the language distinction, and consider two memcpy versions - one using macros, the other using functions (not even with "inline"). Would you say that the second is generally as fast as the first? I'm being intentionally vague: saying that their performance is "about the same" rests on MUCH more fragile assumptions.

The fact that major compiler vendors implement language extensions to facilitate writing optimized code shows that there is a demand for it. Even compilers that are great at optimization (GCC, LLVM) have such intrinsics.

I'm not necessarily advocating changing the core language (e.g. new attributes, things that would need to go into TDPLv2). However, what I think would greatly improve the situation is to have DigitalMars provide recommendations for implementation-specific extensions that provide more control with regards to how the code is compiled (pragma names, keywords starting with __, etc.). Once they're defined, pull requests to add them to DMD will follow.
 Functions below a certain size should be inlined if possible. 
 Those above that size do not benefit perceptibly from inlining. 
 Where that certain size exactly is, who knows, but I doubt that 
 functions near that size will benefit much from user 
 intervention.
I agree, but this wasn't so much about heuristics as about compiler capabilities
(e.g. inlining assembler functions).
Dec 30 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 12:16 AM, Vladimir Panteleev wrote:
 I agree, but this wasn't as much about heuristics, but compiler capabilities
 (e.g. inlining assembler functions).
Adding a keyword won't fix the current problem that the compiler won't inline inline assembler functions. It's an orthogonal issue. I know there are features on various C compilers to force inlining, I know there's a demand for them. But I've also, over the years, spent thousands and thousands of hours optimizing the hell out of things, so I have some experience with it. Once the compiler gets past a certain level of heuristic inlining decisions, forcing it to inline more is just chasing rainbows. And if one really wants to force an inline, one can do things like the C memcpy using the preprocessor, or string mixins in D, or even cut&paste. If you need to do that in more than a couple places in the code, something else is wrong (that old saw about only a tiny percentage of the code being a bottleneck is true). Also, if you are tweaking at such a level, every compiler is different enough that your tweaks are likely to be counterproductive on another compiler. Having a portable syntax for such tweaking is not going to help.
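For reference, the string-mixin route looks roughly like this; a minimal sketch with made-up names, in the same q{} style as the mixin-based memcpy above:

// The "function" body is just a token string; mixin() pastes it at the use
// site, so there is no call left for an inliner to miss.
enum string bumpBoth = q{
    ++dst;
    ++src;
};

void main()
{
    int dst = 0, src = 10;
    mixin(bumpBoth);            // expands in place, much like a C macro
    assert(dst == 1 && src == 11);
}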
Dec 30 2011
next sibling parent Peter Alexander <peter.alexander.au gmail.com> writes:
On 30/12/11 9:13 AM, Walter Bright wrote:
 And if one really wants to force an inline, one can do things like the C
 memcpy using the preprocessor, or string mixins in D, or even cut&paste.
 If you need to do that in more than a couple places in the code,
 something else is wrong (that old saw about only a tiny percentage of
 the code being a bottleneck is true).
When you are writing really performance sensitive code, that old adage is certainly *not* true. It only happens in practice when you don't care that much about performance. When you really care, you've already optimised those hot spots, so what you end up with is a completely flat profile: no part of the program is the bottleneck, but the whole thing is. At that point, you're likely suffering a death from a thousand cuts: no single part of your program is the bottleneck; your poor performance is just the sum total of a bunch of small performance penalties here and there. A perfect example of this is vector operations. Games use vector operations all over the place, so their impact on performance is spread out over the entire program. You'll never see a dot product or vector addition routine at the top of a profile chart, but it will certainly affect performance!
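A sketch of the kind of routine meant here (not taken from any particular engine; the names are made up): each call does only a few flops, so whether the call overhead disappears decides a large share of the total cost.

struct Vec3
{
    float x, y, z;

    // Tiny and used everywhere: exactly the case where guaranteed inlining matters.
    Vec3 opBinary(string op : "+")(Vec3 rhs) const
    {
        return Vec3(x + rhs.x, y + rhs.y, z + rhs.z);
    }
}

float dot(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

void main()
{
    auto v = Vec3(1, 2, 3) + Vec3(4, 5, 6);
    assert(dot(v, v) == 155);
}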
Dec 30 2011
prev sibling next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Friday, 30 December 2011 at 09:13:05 UTC, Walter Bright wrote:
 Also, if you are tweaking at such a level, every compiler is 
 different enough that your tweaks are likely to be 
 counterproductive on another compiler. Having a portable syntax 
 for such tweaking is not going to help.
Which is exactly why I think an inlining pragma/attribute should provide a guarantee, and not a hint. It's a web of assumptions/guarantees: asm blocks provide their guarantees, but using them introduces new assumptions, that e.g. force-inlining solidifies, etc. Back to the macros vs.
 And if one really wants to force an inline, one can do things 
 like the C memcpy using the preprocessor, or string mixins in 
 D, or even cut&paste.
D has nothing from the above that's elegant and maintainable. Timon's solution comes close, but it uses a DSL to make up for what the language doesn't provide.
 If you need to do that in more than a couple places in the 
 code, something else is wrong (that old saw about only a tiny 
 percentage of the code being a bottleneck is true).
What about the context of creating an optimized library, as opposed to optimizing one application?
Dec 30 2011
next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Friday, 30 December 2011 at 12:00:07 UTC, Vladimir Panteleev 
wrote:
 Back to the macros vs.
Oops, didn't mean to send that. I was going to write that comparing C macros with __forceinline functions is a much more level comparison.
Dec 30 2011
prev sibling parent reply so <so so.so> writes:
On Fri, 30 Dec 2011 14:00:06 +0200, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Friday, 30 December 2011 at 09:13:05 UTC, Walter Bright wrote:
 Also, if you are tweaking at such a level, every compiler is different  
 enough that your tweaks are likely to be counterproductive on another  
 compiler. Having a portable syntax for such tweaking is not going to  
 help.
Which is exactly why I think an inlining pragma/attribute should provide a guarantee, and not a hint. It's a web of assumptions/guarantees: asm blocks provide their guarantees, but using them introduces new assumptions, that e.g. force-inlining solidifies, etc.
I agree inline (which will probably be an extension) in D should mean force-inline. Ignoring the impossible-to-inline cases (which in time should get better), adding inline is a few minutes of editing. It will just bypass the cost function, and if it is not possible to inline, pop an error. I don't have enough knowledge of DMD internals, so I am not sure if I should go do it, or maybe I need to start somewhere...
Dec 30 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 7:06 AM, so wrote:
 I agree  inline (which will probably be an extension) in D should mean
 force-inline.
 Ignoring the impossible-to-inline cases (which in time should get better),
 adding  inline is a few minutes of editing.
 It will just bypass the cost function and if it is not possible to inline, pop
 error.
Sure, but I think you'll be very disappointed in that it isn't going to deliver the goods.
Dec 30 2011
next sibling parent Chad J <chadjoan __spam.is.bad__gmail.com> writes:
On 12/30/2011 01:48 PM, Walter Bright wrote:
 On 12/30/2011 7:06 AM, so wrote:
 I agree  inline (which will probably be an extension) in D should mean
 force-inline.
 Ignoring the impossible-to-inline cases (which in time should get
 better),
 adding  inline is a few minutes of editing.
 It will just bypass the cost function and if it is not possible to
 inline, pop
 error.
Sure, but I think you'll be very disappointed in that it isn't going to deliver the goods.
Cool. Put it in and let people use it and get disappointed. Then maybe they will blame themselves instead of DMD. ????. Profit.
Dec 30 2011
prev sibling parent reply so <so so.so> writes:
On Fri, 30 Dec 2011 20:48:54 +0200, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 12/30/2011 7:06 AM, so wrote:
 I agree  inline (which will probably be an extension) in D should mean
 force-inline.
 Ignoring the impossible-to-inline cases (which in time should get  
 better),
 adding  inline is a few minutes of editing.
 It will just bypass the cost function and if it is not possible to  
 inline, pop
 error.
Sure, but I think you'll be very disappointed in that it isn't going to deliver the goods.
dmd_inl -O -inline test.d dmd_inl -O -inline test_inl.d time ./test real 0m4.686s user 0m3.516s sys 0m0.007s time ./test_inl real 0m1.900s user 0m1.503s sys 0m0.007s time ./test real 0m4.381s user 0m3.520s sys 0m0.010s time ./test_inl real 0m1.955s user 0m1.473s sys 0m0.037s time ./test real 0m4.473s user 0m3.506s sys 0m0.017s time ./test_inl real 0m1.836s user 0m1.507s sys 0m0.007s time ./test real 0m4.627s user 0m3.523s sys 0m0.003s time ./test_inl real 0m1.984s user 0m1.480s sys 0m0.030s Just bypassing cost escape, I ll try some complex cases soon after i get phobos working. int test() // test.d int test() inline // test_inl.d { int i = 0; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; ++i; return i; } void main() { for(uint i=0; i<1_000_000_000; ++i) test(); }
Dec 30 2011
parent reply Iain Buclaw <ibuclaw ubuntu.com> writes:
On 31 December 2011 00:48, so <so so.so> wrote:
 On Fri, 30 Dec 2011 20:48:54 +0200, Walter Bright
 <newshound2 digitalmars.com> wrote:

 On 12/30/2011 7:06 AM, so wrote:
 I agree  inline (which will probably be an extension) in D should mean
 force-inline.
 Ignoring the impossible-to-inline cases (which in time should get
 better),
 adding  inline is a few minutes of editing.
 It will just bypass the cost function and if it is not possible to
 inline, pop
 error.
Sure, but I think you'll be very disappointed in that it isn't going to deliver the goods.
 dmd_inl -O -inline test.d dmd_inl -O -inline test_inl.d time ./test real   0m4.686s user   0m3.516s sys    0m0.007s time ./test_inl real   0m1.900s user   0m1.503s sys    0m0.007s time ./test
*SNIP*
 void main()
 {
         for(uint i=0; i<1_000_000_000; ++i)
                 test();
 }
A better compiler would see that the function 'test' has no side effects and its return value is unused, so it eliminates the call to it completely as dead code.

--
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
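One way to make such a benchmark resistant to that, sketched below with an arbitrary loop count and a stand-in body: feed the result into observable output, so the loop is not dead code (a clever compiler may still constant-fold test() itself, but it can no longer discard the work wholesale).

import std.stdio : writeln;

int test()
{
    int i = 0;
    foreach (_; 0 .. 94)        // stand-in for the hand-unrolled ++i chain
        ++i;
    return i;
}

void main()
{
    long sum = 0;
    foreach (n; 0 .. 1_000_000_000)
        sum += test();          // the result is used...
    writeln(sum);               // ...and printed, so the loop is not dead code
}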
Dec 30 2011
parent reply so <so so.so> writes:
On Sat, 31 Dec 2011 03:12:38 +0200, Iain Buclaw <ibuclaw ubuntu.com> wrote:

 On 31 December 2011 00:48, so <so so.so> wrote:
 On Fri, 30 Dec 2011 20:48:54 +0200, Walter Bright
 <newshound2 digitalmars.com> wrote:

 On 12/30/2011 7:06 AM, so wrote:
 I agree  inline (which will probably be an extension) in D should mean
 force-inline.
 Ignoring the impossible-to-inline cases (which in time should get
 better),
 adding  inline is a few minutes of editing.
 It will just bypass the cost function and if it is not possible to
 inline, pop
 error.
Sure, but I think you'll be very disappointed in that it isn't going to deliver the goods.
dmd_inl -O -inline test.d dmd_inl -O -inline test_inl.d time ./test real 0m4.686s user 0m3.516s sys 0m0.007s time ./test_inl real 0m1.900s user 0m1.503s sys 0m0.007s time ./test
*SNIP*
 void main()
 {
        for(uint i=0; i<1_000_000_000; ++i)
                test();
 }
A better compiler would see that the function 'test' has no side effects, and it's return value is unused, so elimates the call to it completely as dead code.
It is just a dummy function that dmd refused to inline; send me a better one (which won't use any libraries) and I'll use it :)
Dec 30 2011
parent reply Iain Buclaw <ibuclaw ubuntu.com> writes:
On 31 December 2011 01:21, so <so so.so> wrote:
 On Sat, 31 Dec 2011 03:12:38 +0200, Iain Buclaw <ibuclaw ubuntu.com> wrote:
 On 31 December 2011 00:48, so <so so.so> wrote:
 On Fri, 30 Dec 2011 20:48:54 +0200, Walter Bright
 <newshound2 digitalmars.com> wrote:

 On 12/30/2011 7:06 AM, so wrote:
 I agree  inline (which will probably be an extension) in D should mean
 force-inline.
 Ignoring the impossible-to-inline cases (which in time should get
 better),
 adding  inline is a few minutes of editing.
 It will just bypass the cost function and if it is not possible to
 inline, pop
 error.
Sure, but I think you'll be very disappointed in that it isn't going to
 deliver the goods.
 dmd_inl -O -inline test.d dmd_inl -O -inline test_inl.d time ./test real   0m4.686s user   0m3.516s sys    0m0.007s time ./test_inl real   0m1.900s user   0m1.503s sys    0m0.007s time ./test
*SNIP*
 void main()
 {
        for(uint i=0; i<1_000_000_000; ++i)
                test();
 }
 A better compiler would see that the function 'test' has no side effects, and its return value is unused, so eliminates the call to it completely as dead code.
It is just a dummy function that dmd rejected to inline, send me a better one (which won't use any libraries) and i'll use it :)
Take a pick of any examples posted on this ML. They are a far better fit to use as a test bed, ideally one that does number crunching and can't be easily folded away.

Regards
--
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
Dec 30 2011
next sibling parent reply so <so so.so> writes:
On Sat, 31 Dec 2011 03:40:43 +0200, Iain Buclaw <ibuclaw ubuntu.com> wrote:

 Take a pick of any examples posted on this ML.  They are far better
 fit to use as a test bed.  Ideally one that does number crunching and
 can't be easily folded away.
Well, not them, but another dummy function; I didn't think it would differ this much.

time ./test_inl
real    0m0.013s
user    0m0.007s
sys     0m0.003s

time ./test
real    0m7.753s
user    0m5.966s
sys     0m0.013s

time ./test_inl
real    0m0.013s
user    0m0.010s
sys     0m0.000s

time ./test
real    0m7.391s
user    0m5.960s
sys     0m0.017s

time ./test_inl
real    0m0.014s
user    0m0.007s
sys     0m0.003s

time ./test
real    0m7.582s
user    0m5.950s
sys     0m0.030s

real test() // test.d
real test()  inline // test_inl.d
{
real a=423123, b=432, c=10, d=100, e=4045, f=123;
a = a / b * c / d + e - f;
b = a / b * c / d + e - f;
c = a / b * c / d + e - f;
d = a / b * c / d + e - f;
e = a / b * c / d + e - f;
f = a / b * c / d + e - f;
a = a / b * c / d + e - f;
b = a / b * c / d + e - f;
c = a / b * c / d + e - f;
d = a / b * c / d + e - f;
e = a / b * c / d + e - f;
f = a / b * c / d + e - f;
a = a / b * c / d + e - f;
b = a / b * c / d + e - f;
c = a / b * c / d + e - f;
d = a / b * c / d + e - f;
e = a / b * c / d + e - f;
f = a / b * c / d + e - f;
return f;
}

void main()
{
for(uint i=0; i<1_000_000_0; ++i)
test();
}
Dec 30 2011
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 5:59 PM, so wrote:
 Well not them but another dummy function, i didn't think it would differ this
much.
It differs that much because once it is inlined, the optimizer deletes it because it does nothing. I don't think it is a valid test.
Dec 30 2011
parent reply so <so so.so> writes:
On Sat, 31 Dec 2011 04:30:01 +0200, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 12/30/2011 5:59 PM, so wrote:
 Well not them but another dummy function, i didn't think it would  
 differ this much.
It differs that much because once it is inlined, the optimizer deletes it because it does nothing. I don't think it is a valid test.
Yes, I can see that from the asm output, but are we talking about the same thing? inline IS all about that. We can try it with any example; one outperforming the other is not the point.

for(....)
    fun()

With or without inline, I know fun should/will get folded away, so why should I pay for the function call?
Dec 30 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 6:35 PM, so wrote:
 With or without  inline i know fun should/will get folded away, then why should
 i pay for the function call?
Because if the function did anything useful, the overhead of the function call is insignificant. I don't think that dealing with large, complex functions that do nothing merits a language extension.
Dec 30 2011
prev sibling parent reply Mike Wey <mike-wey example.com> writes:
On 12/31/2011 02:59 AM, so wrote:
 On Sat, 31 Dec 2011 03:40:43 +0200, Iain Buclaw <ibuclaw ubuntu.com> wrote:

 Take a pick of any examples posted on this ML. They are far better
 fit to use as a test bed. Ideally one that does number crunching and
 can't be easily folded away.
Well not them but another dummy function, i didn't think it would differ this much.
real test() nothrow pure
 real test() // test.d
 real test()  inline // test_inl.d
 {
 real a=423123, b=432, c=10, d=100, e=4045, f=123;
 a = a / b * c / d + e - f;
 b = a / b * c / d + e - f;
 c = a / b * c / d + e - f;
 d = a / b * c / d + e - f;
 e = a / b * c / d + e - f;
 f = a / b * c / d + e - f;
 a = a / b * c / d + e - f;
 b = a / b * c / d + e - f;
 c = a / b * c / d + e - f;
 d = a / b * c / d + e - f;
 e = a / b * c / d + e - f;
 f = a / b * c / d + e - f;
 a = a / b * c / d + e - f;
 b = a / b * c / d + e - f;
 c = a / b * c / d + e - f;
 d = a / b * c / d + e - f;
 e = a / b * c / d + e - f;
 f = a / b * c / d + e - f;
 return f;
 }

 void main()
 {
 for(uint i=0; i<1_000_000_0; ++i)
 test();
 }
When marking the function as pure and nothrow dmd is able to optimize the loop:

.text._Dmain    segment
        assume  CS:.text._Dmain
_Dmain:
        push    RBP
        mov     RBP,RSP
        xor     EAX,EAX
L6:     inc     EAX
        cmp     EAX,0989680h
        jb      L6
        xor     EAX,EAX
        pop     RBP
        ret
.text._Dmain    ends

--
Mike Wey
Dec 31 2011
parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 31 December 2011 13:05, Mike Wey <mike-wey example.com> wrote:
 On 12/31/2011 02:59 AM, so wrote:
 On Sat, 31 Dec 2011 03:40:43 +0200, Iain Buclaw <ibuclaw ubuntu.com>
 wrote:

 Take a pick of any examples posted on this ML. They are far better
 fit to use as a test bed. Ideally one that does number crunching and
 can't be easily folded away.
Well not them but another dummy function, i didn't think it would differ this much.
real test() nothrow pure
 real test() // test.d
 real test()  inline // test_inl.d
 {
 real a=423123, b=432, c=10, d=100, e=4045, f=123;
 a = a / b * c / d + e - f;
 b = a / b * c / d + e - f;
 c = a / b * c / d + e - f;
 d = a / b * c / d + e - f;
 e = a / b * c / d + e - f;
 f = a / b * c / d + e - f;
 a = a / b * c / d + e - f;
 b = a / b * c / d + e - f;
 c = a / b * c / d + e - f;
 d = a / b * c / d + e - f;
 e = a / b * c / d + e - f;
 f = a / b * c / d + e - f;
 a = a / b * c / d + e - f;
 b = a / b * c / d + e - f;
 c = a / b * c / d + e - f;
 d = a / b * c / d + e - f;
 e = a / b * c / d + e - f;
 f = a / b * c / d + e - f;
 return f;
 }

 void main()
 {
 for(uint i=0; i<1_000_000_0; ++i)
 test();
 }
When marking the function as pure and nothrow dmd is able to optimize the loop:

 .text._Dmain    segment
         assume  CS:.text._Dmain
 _Dmain:
         push    RBP
         mov     RBP,RSP
         xor     EAX,EAX
 L6:     inc     EAX
         cmp     EAX,0989680h
         jb      L6
         xor     EAX,EAX
         pop     RBP
         ret
 .text._Dmain    ends

 --
 Mike Wey
Yep, as I've mentioned earlier, the function has no side effects and its return value is not used, hence it can be optimised away completely.

--
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
Dec 31 2011
prev sibling parent so <so so.so> writes:
On Sat, 31 Dec 2011 03:40:43 +0200, Iain Buclaw <ibuclaw ubuntu.com> wrote:

 Take a pick of any examples posted on this ML.  They are far better
 fit to use as a test bed.  Ideally one that does number crunching and
 can't be easily folded away.
I don't understand your point, btw: why shouldn't it be easily folded away? inline is exactly for that reason; why would I pay for something I don't want?
Dec 30 2011
prev sibling parent reply Chad J <chadjoan __spam.is.bad__gmail.com> writes:
On 12/30/2011 04:13 AM, Walter Bright wrote:
 On 12/30/2011 12:16 AM, Vladimir Panteleev wrote:
 I agree, but this wasn't as much about heuristics, but compiler
 capabilities
 (e.g. inlining assembler functions).
Adding a keyword won't fix the current problem that the compiler won't inline inline assembler functions. It's an orthogonal issue. I know there are features on various C compilers to force inlining, I know there's a demand for them. But I've also, over the years, spent thousands and thousands of hours optimizing the hell out of things, so I have some experience with it. Once the compiler gets past a certain level of heuristic inlining decisions, forcing it to inline more is just chasing rainbows.
When a compiler ISN'T past a certain level of heuristic inlining, then being able to tell it to inline can save one's ass.

I hit this when writing a flash game. It was doing the slide-show thing while on a collision detection broadphase (IIRC) when it went to sort everything. The language I was using, haXe, was pretty young at the time and the compiler probably wasn't inlining well. BUT, it did have an inline keyword. I plopped it down in a few select places and BAM, the broadphase is ~100x faster and life goes on. Things were going to get really damn ugly if I couldn't do that. (haXe is a pretty cool language, just not as featureful as D.)

Nonetheless, this is the less important issue...
 And if one really wants to force an inline, one can do things like the C
 memcpy using the preprocessor, or string mixins in D, or even cut&paste.
 If you need to do that in more than a couple places in the code,
 something else is wrong (that old saw about only a tiny percentage of
 the code being a bottleneck is true).
 
 Also, if you are tweaking at such a level, every compiler is different
 enough that your tweaks are likely to be counterproductive on another
 compiler. Having a portable syntax for such tweaking is not going to help.
This is striking me as becoming a human factors problem. People want a way to tell the compiler to inline things. They are /going/ to get that, one way or another. It /will/ happen, regardless of how experienced /you/ are. They also may not go about it in entirely reasonable ways, and then you end up with code optimized for one compiler that doesn't compile at all on another. This sucks really bad for people compiling a program that they didn't write. And to me, that's what I worry about most. ... As an aside, I think that people want forced inlining because it gives them another tool to tweak with. My experiences with optimization tend to suggest I can usually optimize things really well with a few short cycles of profile->experiment->profile. I don't think I've ever really /needed/ to dive into assembly yet. My ventures into the assembler have been either purely recreational or academic in nature. Now, something like an inline feature can help a lot with the "experiment" part of the cycle. It's just another knob to twist and see if it gives the result you want. Portability be damned, if it gets the thing out the door, I'm using it! But, I kind of hate that attitude. So it's much more comforting to be able to twist that knob without sacrificing portability too. I wouldn't expect it to run as fast on other compilers; I /would/ expect it to compile and run correctly on other compilers. And if enregistering variables is more important, then we might want to have a way to enregister variables too.
Dec 30 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 10:31 AM, Chad J wrote:
 As an aside, I think that people want forced inlining because it gives
 them another tool to tweak with.  My experiences with optimization tend
 to suggest I can usually optimize things really well with a few short
 cycles of profile->experiment->profile.  I don't think I've ever really
 /needed/ to dive into assembly yet.  My ventures into the assembler have
 been either purely recreational or academic in nature.  Now, something
 like an inline feature can help a lot with the "experiment" part of the
 cycle.  It's just another knob to twist and see if it gives the result
 you want.  Portability be damned, if it gets the thing out the door, I'm
 using it!  But, I kind of hate that attitude.  So it's much more
 comforting to be able to twist that knob without sacrificing portability
 too.  I wouldn't expect it to run as fast on other compilers; I /would/
 expect it to compile and run correctly on other compilers.  And if
 enregistering variables is more important, then we might want to have a
 way to enregister variables too.
Back in the olden days, I provided a detailed list of optimizer switches that turned on/off all sorts of optimizations. In the end it turned out that all people wanted was an "optimize" switch which is why dmd has only -O. The reason dmd has a -inline switch is because it's hard to debug code that has been inlined.

The reason C's "register" keyword went away was because:

1. the variables after optimization transformations may be very different than before
2. programmers stunk at picking the right variables for registers
3. even if (2) was done right, as soon as the first code maintainer dinked with it, they never bothered to go fix the register declarations
4. optimizers got pretty good at automatic register allocation
5. there's nothing portable about enregistering, even with a portable syntax
6. the register keyword offered no way to hint which variables were more important to enregister than others
Dec 30 2011
parent Chad J <chadjoan __spam.is.bad__gmail.com> writes:
On 12/30/2011 02:00 PM, Walter Bright wrote:
 On 12/30/2011 10:31 AM, Chad J wrote:
 As an aside, I think that people want forced inlining because it gives
 them another tool to tweak with.  My experiences with optimization tend
 to suggest I can usually optimize things really well with a few short
 cycles of profile->experiment->profile.  I don't think I've ever really
 /needed/ to dive into assembly yet.  My ventures into the assembler have
 been either purely recreational or academic in nature.  Now, something
 like an inline feature can help a lot with the "experiment" part of the
 cycle.  It's just another knob to twist and see if it gives the result
 you want.  Portability be damned, if it gets the thing out the door, I'm
 using it!  But, I kind of hate that attitude.  So it's much more
 comforting to be able to twist that knob without sacrificing portability
 too.  I wouldn't expect it to run as fast on other compilers; I /would/
 expect it to compile and run correctly on other compilers.  And if
 enregistering variables is more important, then we might want to have a
 way to enregister variables too.
Back in the olden days, I provided a detailed list of optimizer switches that turned on/off all sorts of optimizations. In the end it turned out that all people wanted was an "optimize" switch which is why dmd has only -O. The reason dmd has a -inline switch is because it's hard to debug code that has been inlined. The reason C's "register" keyword went away was because: 1. the variables after optimization transformations may be very different than before 2. programmers stunk at picking the right variables for registers 3. even if (2) was done right, as soon as the first code maintainer dinked with it, they never bothered to go fix the register declarations 4. optimizers got pretty good at automatic register allocation 5. there's nothing portable about enregistering, even with a portable syntax 6. the register keyword offered no way to hint which variables were more important to enregister than others
Huh, bummer dudes. 6 seems pretty solvable. Too bad about the other 5. ;)
Dec 30 2011
prev sibling parent Trass3r <un known.com> writes:
 I completely agree that DMD's inliner is underpowered and needs  
 improvement. I am less sure that this demonstrates that the language  
 needs changes.

 Functions below a certain size should be inlined if possible. Those  
 above that size do not benefit perceptibly from inlining. Where that  
 certain size exactly is, who knows, but I doubt that functions near that  
 size will benefit much from user intervention.
More specifically, a distinction like gcc's would be nice:

"-finline-small-functions
Integrate functions into their callers when their body is smaller than expected function call code (so overall size of program gets smaller). The compiler heuristically decides which functions are simple enough to be worth integrating in this way. Enabled at level -O2.

-finline-functions
Integrate all simple functions into their callers. The compiler heuristically decides which functions are simple enough to be worth integrating in this way. Enabled at level -O3."
Dec 30 2011
prev sibling parent reply "Martin Nowak" <dawg dawgfoto.de> writes:
On Fri, 30 Dec 2011 06:51:44 +0100, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Thursday, 29 December 2011 at 19:47:39 UTC, Walter Bright wrote:
 On 12/29/2011 3:19 AM, Vladimir Panteleev wrote:
 I'd like to invite you to translate Daniel Vik's C memcpy  
 implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html
Challenge accepted.
Ah, a direct translation using functions! This is probably the most elegant approach, however - as I'm sure you've noticed - the programmer has no control over what gets inlined.
 Examining the assembler output, it inlines everything except  
 COPY_SHIFT, COPY_NO_SHIFT, and COPY_REMAINING. The inliner in dmd could  
 definitely be improved, but that is not a problem with the language,  
 but the implementation.
This is the problem with heuristic inlining: while great by itself, in a position such as this the programmer is left with no choice but to examine the assembler output to make sure the compiler does what the programmer wants it to do. Such behavior can change from one implementation to another, and even from one compiler version to another. (After all, I don't think that we can guarantee that what's inlined today, will be inlined tomorrow.)
For real performance bottlenecks one should always examine the assembly. For most code, inlining hardly ever matters for the runtime of your program, and focusing on efficient algorithms is most important.

What really baffles me is that people want control over inlining, but nobody seems to ever have noticed that x64 switch doesn't switch and x64 vector ops aren't vectorized, both of which are really important in performance-sensitive code.
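Presumably that refers to array operations of the following shape (a minimal sketch; the sizes and names are arbitrary), which an implementation is free to compile to SIMD loops or to plain scalar code:

void axpy(float[] y, const(float)[] x, float a)
{
    y[] += x[] * a;             // vector-op syntax; ideally one SIMD loop
}

void main()
{
    auto x = new float[](1024);
    auto y = new float[](1024);
    x[] = 1.0f;
    y[] = 2.0f;
    axpy(y, x, 3.0f);
    assert(y[0] == 5.0f && y[1023] == 5.0f);
}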
 Continuing in that vein, please note that neither C nor C++ require  
 inlining of any sort. The "inline" keyword is merely a hint to the  
 compiler. What inlining takes place is completely implementation  
 defined, not language defined.
I think we can agree that the C inline hint is of limited use. However, major C compiler vendors implement an extension to force inlining. Generally, I would say that common vendor extensions seen in other languages are an opportunity for D to avoid a similar mess: such extensions would not have to be required to be implemented, but when they are, they would use the same syntax across implementations.
 I wish to note that the D version semantically accomplishes the same  
 thing as the C version without using mixins or CTFE - it's all  
 straightforward code, without the abusive preprocessor tricks.
I don't think there's much value in that statement. After all, except for a few occasional templates (which weren't strictly necessary), your translation uses few D-specific features. If you were to leave yourself at the mercy of a C compiler's optimizer, your rewrite would merely be a testament against C macros, not the power of D. However, the most important part is: this translation is incorrect. C macros in the original code provide a guarantee that the code is inlined. D cannot make such guarantees - even your amended version is tuned to one specific implementation (and possibly, only a specific range of versions of it).
Jan 03 2012
parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 3 January 2012 at 18:49:35 UTC, Martin Nowak wrote:
 For real performance bottlenecks one should always examine the 
 assembly. For most code inlining hardly ever matters for the 
 runtime of your program and focusing on efficient algorithms is 
 most important.

 What really baffles me is that people want control over inlining
 but nobody seems to ever have noticed that x64 switch doesn't 
 switch and x64 vector ops aren't vectorized. Both of which are 
 really important in performance sensitive code.
Quality of implementations' optimizations and a common syntax for code compilation guarantees are orthogonal issues.
Jan 04 2012
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/29/2011 12:19 PM, Vladimir Panteleev wrote:
 On Thursday, 29 December 2011 at 09:16:23 UTC, Walter Bright wrote:
 Are you a ridiculous hacker? Inline x86 assembly that the compiler
 actually understands in 32 AND 64 bit code, hex string literals like
 x"DE ADB EEF" where spacing doesn't matter, the ability to set data
 alignment cross-platform with type.alignof = 16, load your shellcode
 verbatim into a string like so: auto str = import("shellcode.txt");
I would like to talk about this for a bit. Personally, I think D's system programming abilities are only half-way there. Note that I am not talking about use cases in high-level application code, but rather low-level, widely-used framework code, where every bit of performance matters (for example: memory copy routines, string builders, garbage collectors). In-line assembler as part of the language is certainly neat, and in fact coming from Delphi to C++ I was surprised to learn that C++ implementations adopted different syntax for asm blocks. However, compared to some C++ compilers, it has severe limitations and is D's only trick in this alley. For one thing, there is no way to force the compiler to inline a function (like __forceinline / __attribute((always_inline)) ). This is fine for high-level code (where users are best left with PGO and "the compiler knows best"), but sucks if you need a guarantee that the function must be inlined. The guarantee isn't just about inlining heuristics, but also implementation capabilities. For example, some implementations might not be able to inline functions that use certain language features, and your code's performance could demand that such a short function must be inlined. One example of this is inlining functions containing asm blocks - IIRC DMD does not support this.
That does not mean the language does not support it; ldc and gdc can probably do it.
 The
 compiler should fail the build if it can't inline a function tagged with
  forceinline, instead of shrugging it off and failing silently, forcing
 users to check the disassembly every time.
+1. I think we should extend the 'enum' storage class to functions, and introduce cast(enum) to force instant evaluation.

void foo() enum{...} // always inlined or compile error. declaration alone does not contribute code to the object file
void goo(){...}      // inlined at compiler's discretion

void main(){
    (cast(enum)goo)(); // inlined or compile error
}
 You may have noticed that GCC has some ridiculously complicated
 assembler facilities. However, they also open the way to the
 possibilities of writing optimal code - for example, creating custom
 calling conventions, or inlining assembler functions without restricting
 the caller's register allocation with a predetermined calling
 convention. In contrast, DMD is very conservative when it comes to
 mixing D and assembler. One time I found that putting an asm block in a
 function turned what were single instructions into blocks of 6
 instructions each.

 D's lacking  in this area makes it impossible to create language features
 that are on the level of D's compiler built-ins. For example, I have
 tested three memcpy implementations recently, but none of them could
 beat DMD's standard array slice copy (despite that in release mode it
 compiles to a simple memcpy call). Why? Because the overhead of using a
 custom memcpy routine negated its performance gains.
I don't think you should use DMD to benchmark the D language.
 This might have been alleviated with the presence of sane macros, but no
 such luck. String mixins are not the answer: trying to translate
 macro-heavy C code to D using string mixins is string escape hell, and
 we're back to the level of shell scripts.
No string escape hell if you do it right.
 We've discussed this topic on IRC recently. From what I understood,
 Andrei thinks improvements in this area are not "impactful" enough,
 which I find worrisome.
Me too.
 Personally, I don't think D qualifies as a true "system programming
 language" in light of the above.
Neither do C or C++ without compiler specific extensions. We should definitely standardise such features in D.
 It's more of a compiled language with
 pointers and assembler. Before you disagree with any of the above, first
 (for starters) I'd like to invite you to translate Daniel Vik's C memcpy
 implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html . It doesn't even
 use inline assembler or compiler intrinsics.
OK, will do.
Dec 29 2011
next sibling parent David Nadlinger <see klickverbot.at> writes:
On 12/29/11 9:58 PM, Timon Gehr wrote:
 On 12/29/2011 12:19 PM, Vladimir Panteleev wrote:
 [â€Ļ]One example of this is inlining
 functions containing asm blocks - IIRC DMD does not support this.
That does not mean the language does not support it, probably ldc and gdc can do it.
LDC has pragma(allow_inline), which allows you to mark a function containing inline asm as safe to inline. David
Dec 29 2011
prev sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 29 December 2011 at 20:58:59 UTC, Timon Gehr wrote:
 I don't think you should use DMD to benchmark the D language.
You're missing my point. We can't count on the optimizers in all implementations being perfect. I am suggesting language features which could provide guarantees to the programmer regarding how the code will be compiled. If an implementation cannot satisfy them, the programmer should be told so, so he could try something else - rather than having to sift through disassembler listings or use a profiler.
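
(For illustration only, the kind of guarantee meant here might read something like the following - the attribute name and the fail-the-build behaviour are invented for the sake of the example, nothing like it exists in D today:)

// @forceinline is hypothetical: the point is that the build fails,
// rather than silently falling back, if this cannot be inlined.
@forceinline uint rotl13(uint x)
{
    return (x << 13) | (x >> 19);
}

uint hash(uint x) { return rotl13(x) ^ x; } // guaranteed: no call overhead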
Dec 29 2011
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/29/2011 12:19 PM, Vladimir Panteleev wrote:
 Before you disagree with any of the above, first
 (for starters) I'd like to invite you to translate Daniel Vik's C memcpy
 implementation to D:
 http://www.danielvik.com/2010/02/fast-memcpy-in-c.html . It doesn't even
 use inline assembler or compiler intrinsics.
Ok, I have performed a direct translation (with all the preprocessor stuff replaced by string mixins). However, I think I could do a lot better starting from scratch in D. I have performed some basic testing with all the configuration options, and it seems to work correctly.

// File: memcpy.d  direct translation of memcpy.c

/********************************************************************
** File: memcpy.c
**
** Copyright (C) 1999-2010 Daniel Vik
**
** This software is provided 'as-is', without any express or implied
** warranty. In no event will the authors be held liable for any
** damages arising from the use of this software.
** Permission is granted to anyone to use this software for any
** purpose, including commercial applications, and to alter it and
** redistribute it freely, subject to the following restrictions:
**
** 1. The origin of this software must not be misrepresented; you
**    must not claim that you wrote the original software. If you
**    use this software in a product, an acknowledgment in the
**    product documentation would be appreciated but is not
**    required.
**
** 2. Altered source versions must be plainly marked as such, and
**    must not be misrepresented as being the original software.
**
** 3. This notice may not be removed or altered from any source
**    distribution.
**
**
** Description: Implementation of the standard library function memcpy.
**              This implementation of memcpy() is ANSI-C89 compatible.
**
**              The following configuration options can be set:
**
**              LITTLE_ENDIAN - Uses processor with little endian
**                              addressing. Default is big endian.
**
**              PRE_INC_PTRS  - Use pre increment of pointers.
**                              Default is post increment of
**                              pointers.
**
**              INDEXED_COPY  - Copying data using array indexing.
**                              Using this option, disables the
**                              PRE_INC_PTRS option.
**
**              MEMCPY_64BIT  - Compiles memcpy for 64 bit
**                              architectures
**
**
** Best Settings:
**
**              Intel x86:  LITTLE_ENDIAN and INDEXED_COPY
**
*******************************************************************/

/********************************************************************
** Configuration definitions.
*******************************************************************/

version = LITTLE_ENDIAN;
version = INDEXED_COPY;

/********************************************************************
** Includes for size_t definition
*******************************************************************/

/********************************************************************
** Typedefs
*******************************************************************/

version(MEMCPY_64BIT) version(D_LP32) static assert(0, "not a 64 bit compile");

version(D_LP64){
    alias ulong UIntN;
    enum TYPE_WIDTH = 8;
}else{
    alias uint UIntN;
    enum TYPE_WIDTH = 4;
}

/********************************************************************
** Remove definitions when INDEXED_COPY is defined.
*******************************************************************/

version(INDEXED_COPY){
    version(PRE_INC_PTRS) static assert(0, "cannot use INDEXED_COPY together with PRE_INC_PTRS!");
}

/********************************************************************
** The X template
*******************************************************************/

string Ximpl(string x){
    import utf = std.utf;
    string r=`"`;
    for(typeof(x.length) i=0;i<x.length;r~=x[i..i+utf.stride(x,i)],i+=utf.stride(x,i)){
        if(x[i]==' '&&x[i+1]=='('){
            auto start = ++i;
            int nest=1;
            while(nest){
                i+=utf.stride(x,i);
                if(x[i]=='(') nest++;
                else if(x[i]==')') nest--;
            }
            i++;
            r~=`"~`~x[start..i]~`~"`;
            if(i==x.length) break;
        }
        if(x[i]=='"'||x[i]=='\\'){r~="\\"; continue;}
    }
    return r~`"`;
}

template X(string x){ enum X = Ximpl(x); }

/********************************************************************
** Definitions for pre and post increment of pointers.
*******************************************************************/

// uses *(*&x)++ and similar to work around a bug in the parser
version(PRE_INC_PTRS){
    string START_VAL(string x) {return mixin(X!q{(*& (x))--;});}
    string INC_VAL(string x)   {return mixin(X!q{*++(*& (x))});}
    string CAST_TO_U8(string p, string o){
        return mixin(X!q{(cast(ubyte*) (p) + (o) + TYPE_WIDTH)});
    }
    enum WHILE_DEST_BREAK  = (TYPE_WIDTH - 1);
    enum PRE_LOOP_ADJUST   = q{- (TYPE_WIDTH - 1)};
    enum PRE_SWITCH_ADJUST = q{+ 1};
}else{
    string START_VAL(string x) {return q{};}
    string INC_VAL(string x)   {return mixin(X!q{*(*& (x))++});}
    string CAST_TO_U8(string p, string o){
        return mixin(X!q{(cast(ubyte*) (p) + (o))});
    }
    enum WHILE_DEST_BREAK  = 0;
    enum PRE_LOOP_ADJUST   = q{};
    enum PRE_SWITCH_ADJUST = q{};
}

/********************************************************************
** Definitions for endians
*******************************************************************/

version(LITTLE_ENDIAN){
    enum SHL = q{>>};
    enum SHR = q{<<};
}else{
    enum SHL = q{<<};
    enum SHR = q{>>};
}

/********************************************************************
** Macros for copying words of different alignment.
** Uses incremening pointers.
*******************************************************************/

string CP_INCR() {
    return mixin(X!q{
        (INC_VAL(q{dstN})) = (INC_VAL(q{srcN}));
    });
}

string CP_INCR_SH(string shl, string shr) {
    return mixin(X!q{
        dstWord  = srcWord (SHL) (shl);
        srcWord  = (INC_VAL(q{srcN}));
        dstWord |= srcWord (SHR) (shr);
        (INC_VAL(q{dstN})) = dstWord;
    });
}

/********************************************************************
** Macros for copying words of different alignment.
** Uses array indexes.
*******************************************************************/

string CP_INDEX(string idx) {
    return mixin(X!q{
        dstN[ (idx)] = srcN[ (idx)];
    });
}

string CP_INDEX_SH(string x, string shl, string shr) {
    return mixin(X!q{
        dstWord  = srcWord (SHL) (shl);
        srcWord  = srcN[ (x)];
        dstWord |= srcWord (SHR) (shr);
        dstN[ (x)] = dstWord;
    });
}

/********************************************************************
** Macros for copying words of different alignment.
** Uses incremening pointers or array indexes depending on
** configuration.
*******************************************************************/

version(INDEXED_COPY){
    alias CP_INDEX CP;
    alias CP_INDEX_SH CP_SH;
    string INC_INDEX(string p, string o){
        return mixin(X!q{
            (( (p)) += ( (o)));
        });
    }
}else{
    string CP(string idx) {return mixin(X!q{ (CP_INCR())});}
    string CP_SH(string idx, string shl, string shr){
        return mixin(X!q{
            (CP_INCR_SH(mixin(X!q{ (shl)}), mixin(X!q{ (shr)})));
        });
    }
    string INC_INDEX(string p, string o){return q{};}
}

string COPY_REMAINING(string count) {
    return mixin(X!q{
        (START_VAL(q{dst8}));
        (START_VAL(q{src8}));
        switch ( (count)) {
        case 7: (INC_VAL(q{dst8})) = (INC_VAL(q{src8}));
        case 6: (INC_VAL(q{dst8})) = (INC_VAL(q{src8}));
        case 5: (INC_VAL(q{dst8})) = (INC_VAL(q{src8}));
        case 4: (INC_VAL(q{dst8})) = (INC_VAL(q{src8}));
        case 3: (INC_VAL(q{dst8})) = (INC_VAL(q{src8}));
        case 2: (INC_VAL(q{dst8})) = (INC_VAL(q{src8}));
        case 1: (INC_VAL(q{dst8})) = (INC_VAL(q{src8}));
        case 0:
        default: break;
        }
    });
}

string COPY_NO_SHIFT() {
    return mixin(X!q{
        UIntN* dstN = cast(UIntN*)(dst8 (PRE_LOOP_ADJUST));
        UIntN* srcN = cast(UIntN*)(src8 (PRE_LOOP_ADJUST));
        size_t length = count / TYPE_WIDTH;

        while (length & 7) {
            (CP_INCR());
            length--;
        }

        length /= 8;

        while (length--) {
            (CP(q{0}));
            (CP(q{1}));
            (CP(q{2}));
            (CP(q{3}));
            (CP(q{4}));
            (CP(q{5}));
            (CP(q{6}));
            (CP(q{7}));
            (INC_INDEX(q{dstN}, q{8}));
            (INC_INDEX(q{srcN}, q{8}));
        }

        src8 = (CAST_TO_U8(q{srcN}, q{0}));
        dst8 = (CAST_TO_U8(q{dstN}, q{0}));

        (COPY_REMAINING(q{count & (TYPE_WIDTH - 1)}));

        return dest;
    });
}

string COPY_SHIFT(string shift) {
    return mixin(X!q{
        UIntN* dstN = cast(UIntN*)(((cast(UIntN)dst8) (PRE_LOOP_ADJUST)) & ~(TYPE_WIDTH - 1));
        UIntN* srcN = cast(UIntN*)(((cast(UIntN)src8) (PRE_LOOP_ADJUST)) & ~(TYPE_WIDTH - 1));
        size_t length = count / TYPE_WIDTH;
        UIntN srcWord = (INC_VAL(q{srcN}));
        UIntN dstWord;

        while (length & 7) {
            (CP_INCR_SH(mixin(X!q{8 * (shift)}), mixin(X!q{8 * (TYPE_WIDTH - (shift))})));
            length--;
        }

        length /= 8;

        while (length--) {
            (CP_SH(q{0}, mixin(X!q{8 * (shift)}), mixin(X!q{8 * (TYPE_WIDTH - (shift))})));
            (CP_SH(q{1}, mixin(X!q{8 * (shift)}), mixin(X!q{8 * (TYPE_WIDTH - (shift))})));
            (CP_SH(q{2}, mixin(X!q{8 * (shift)}), mixin(X!q{8 * (TYPE_WIDTH - (shift))})));
            (CP_SH(q{3}, mixin(X!q{8 * (shift)}), mixin(X!q{8 * (TYPE_WIDTH - (shift))})));
            (CP_SH(q{4}, mixin(X!q{8 * (shift)}), mixin(X!q{8 * (TYPE_WIDTH - (shift))})));
            (CP_SH(q{5}, mixin(X!q{8 * (shift)}), mixin(X!q{8 * (TYPE_WIDTH - (shift))})));
            (CP_SH(q{6}, mixin(X!q{8 * (shift)}), mixin(X!q{8 * (TYPE_WIDTH - (shift))})));
            (CP_SH(q{7}, mixin(X!q{8 * (shift)}), mixin(X!q{8 * (TYPE_WIDTH - (shift))})));
            (INC_INDEX(q{dstN}, q{8}));
            (INC_INDEX(q{srcN}, q{8}));
        }

        src8 = (CAST_TO_U8(q{srcN}, mixin(X!q{( (shift) - TYPE_WIDTH)})));
        dst8 = (CAST_TO_U8(q{dstN}, q{0}));

        (COPY_REMAINING(q{count & (TYPE_WIDTH - 1)}));

        return dest;
    });
}

/********************************************************************
**
** void *memcpy(void *dest, const void *src, size_t count)
**
** Args:     dest   - pointer to destination buffer
**           src    - pointer to source buffer
**           count  - number of bytes to copy
**
** Return:   A pointer to destination buffer
**
** Purpose:  Copies count bytes from src to dest.
**           No overlap check is performed.
**
*******************************************************************/

void *memcpy(void *dest, const void *src, size_t count)
{
    ubyte* dst8 = cast(ubyte*)dest;
    ubyte* src8 = cast(ubyte*)src;

    if (count < 8) {
        mixin(COPY_REMAINING(q{count}));
        return dest;
    }

    mixin(START_VAL(q{dst8}));
    mixin(START_VAL(q{src8}));

    while ((cast(UIntN)dst8 & (TYPE_WIDTH - 1)) != WHILE_DEST_BREAK) {
        mixin(INC_VAL(q{dst8})) = mixin(INC_VAL(q{src8}));
        count--;
    }

    switch ((mixin(`(cast(UIntN)src8)`~ PRE_SWITCH_ADJUST)) & (TYPE_WIDTH - 1)) {
        // { } required to work around DMD bug
        case 0: {mixin(COPY_NO_SHIFT());} break;
        case 1: {mixin(COPY_SHIFT(q{1}));} break;
        case 2: {mixin(COPY_SHIFT(q{2}));} break;
        case 3: {mixin(COPY_SHIFT(q{3}));} break;
        static if(TYPE_WIDTH > 4){ // was TYPE_WIDTH >= 4. bug in original code.
        case 4: {mixin(COPY_SHIFT(q{4}));} break;
        case 5: {mixin(COPY_SHIFT(q{5}));} break;
        case 6: {mixin(COPY_SHIFT(q{6}));} break;
        case 7: {mixin(COPY_SHIFT(q{7}));} break;
        }
        default: assert(0);
    }
}

void main(){
    int[13] x = [1,2,3,4,5,6,7,8,9,0,1,2,3];
    int[13] y;
    memcpy(y.ptr, x.ptr, x.sizeof);
    import std.stdio;
    writeln(y);
}
Dec 29 2011
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 29 December 2011 at 23:47:08 UTC, Timon Gehr wrote:
 ** The X template
Good work, but I'm not sure if inventing a DSL to make up for the problems in D string mixins that C macros don't have qualifies as "doing it right".
Dec 29 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/30/2011 06:58 AM, Vladimir Panteleev wrote:
 On Thursday, 29 December 2011 at 23:47:08 UTC, Timon Gehr wrote:
 ** The X template
Good work, but I'm not sure if inventing a DSL to make up for the problems in D string mixins that C macros don't have qualifies as "doing it right".
It certainly does. That is how all my code generation looks like. The fact that I am using string mixins to solve some problems shows that those are not 'problems in D string mixins'.
Dec 30 2011
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Friday, 30 December 2011 at 12:05:27 UTC, Timon Gehr wrote:
 On 12/30/2011 06:58 AM, Vladimir Panteleev wrote:
 On Thursday, 29 December 2011 at 23:47:08 UTC, Timon Gehr 
 wrote:
 ** The X template
Good work, but I'm not sure if inventing a DSL to make up for the problems in D string mixins that C macros don't have qualifies as "doing it right".
It certainly does. That is how all my code generation looks like. The fact that I am using string mixins to solve some problems shows that those are not 'problems in D string mixins'.
Never mind. You're right. I hadn't thought of this before (using DSL nesting to avoid breaking token nesting); it's a nice idea. I think I'll steal this for my code :)
Dec 30 2011
parent so <so so.so> writes:
On Fri, 30 Dec 2011 16:11:54 +0200, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Friday, 30 December 2011 at 12:05:27 UTC, Timon Gehr wrote:
 On 12/30/2011 06:58 AM, Vladimir Panteleev wrote:
 On Thursday, 29 December 2011 at 23:47:08 UTC, Timon Gehr wrote:
 ** The X template
Good work, but I'm not sure if inventing a DSL to make up for the problems in D string mixins that C macros don't have qualifies as "doing it right".
It certainly does. That is how all my code generation looks like. The fact that I am using string mixins to solve some problems shows that those are not 'problems in D string mixins'.
Never mind. You're right. I hadn't thought of this before (using DSL nesting to avoid breaking token nesting); it's a nice idea. I think I'll steal this for my code :)
For me, mixin sounds much more intuitive than inline for what we are trying to achieve with force-inline. If it was user friendly, now that would be awesome.
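
(A rough sketch of that intuition in today's D - the names are mine, and the ergonomics are exactly the problem:)

enum bumpCounter = q{ counter += step; }; // the would-be "inlined" body, as a token string

void main()
{
    int counter = 0, step = 2;
    mixin(bumpCounter); // always expanded in place: no call, no inlining heuristics involved
    mixin(bumpCounter);
    assert(counter == 4);
}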
Dec 30 2011
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 4:05 AM, Timon Gehr wrote:
 It certainly does. That is how all my code generation looks like. The fact that
 I am using string mixins to solve some problems shows that those are not
 'problems in D string mixins'.
I think your solution to parameterized strings is very nice. Can you write a brief article about it? This should be more widely known.
Dec 30 2011
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/30/11 12:10 PM, Walter Bright wrote:
 On 12/30/2011 4:05 AM, Timon Gehr wrote:
 It certainly does. That is how all my code generation looks like. The
 fact that
 I am using string mixins to solve some problems shows that those are not
 'problems in D string mixins'.
I think your solution to parameterized strings is very nice. Can you write a brief article about it? This should be more widely known.
The idea is good, but nonhygienic: the macro's expansion picks up symbols from the expansion context. Timon, to move from good to great, you may want to add parameters to the expansion process such that you replace the argument values during expansion. Andrei
Dec 30 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/30/2011 09:51 PM, Andrei Alexandrescu wrote:
 On 12/30/11 12:10 PM, Walter Bright wrote:
 On 12/30/2011 4:05 AM, Timon Gehr wrote:
 It certainly does. That is how all my code generation looks like. The
 fact that
 I am using string mixins to solve some problems shows that those are not
 'problems in D string mixins'.
I think your solution to parameterized strings is very nice. Can you write a brief article about it? This should be more widely known.
The idea is good, but nonhygienic: the macro's expansion picks up symbols from the expansion context.
What the template 'X' currently achieves is an improvement in syntax:

string generated = "foo!\""~x~"\"(\""~bar(y)~"\")";

vs

string generated = mixin(X!q{ foo!" (x)"(" (bar(y))") });

i.e. it is assumed that the generated code that results in a string expression will be mixed in right away. Kenji Hara's string mixin template proposal could be pulled to be able to enforce this at the same time as improving the syntax further:

mixin template X(string s){enum X = XImpl(s);}

string generated = X!q{ foo!" (x)"(" (bar(y))") }
 Timon, to move from good to great, you may want to add parameters to the
 expansion process such that you replace the argument values during
 expansion.
I like this:

string QUX(string param1, string op, string param2){
    return mixin(X!q{ ("__"~param1) (op) (param2~"__"); });
}

a lot more than this:

string QUX(string param1, string op, string param2){
    return mixin(X!(q{ 1 2 3 },"__"~param1, op, param2~"__"));
}

In an ideal world, I think the macro could be defined like this (using the new anonymous function syntax on a named function):

string QUX(string param1, string op, string param2) => X!q{ ("__"~param1) (op) (param2~"__"); };

and expanded like this:

mixin(QUX("foo","+","bar"));

I think what you have in mind is that macros are defined similar to this, and then would be expanded like this:

mixin(X!(QUX, "foo", "+", "bar"));

Is this better? I think it makes it more difficult to write and use such a macro, because there are no parameter names to document what the parameters are for.
Dec 30 2011
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/30/11 3:51 PM, Timon Gehr wrote:
 On 12/30/2011 09:51 PM, Andrei Alexandrescu wrote:
 On 12/30/11 12:10 PM, Walter Bright wrote:
 On 12/30/2011 4:05 AM, Timon Gehr wrote:
 It certainly does. That is how all my code generation looks like. The
 fact that
 I am using string mixins to solve some problems shows that those are
 not
 'problems in D string mixins'.
I think your solution to parameterized strings is very nice. Can you write a brief article about it? This should be more widely known.
The idea is good, but nonhygienic: the macro's expansion picks up symbols from the expansion context.
What the template 'X' currently achieves is an improvement in syntax: string generated = "foo!\""~x~"\"(\""~bar(y)~"\")"; vs string generated = mixin(X!q{ foo!" (x)"(" (bar(y))") });
I understand that. But the whole system must be redesigned. Quoting from my email (please let's continue here so as to avoid duplication):

The macro facility should be very simple: any compile-time string can be a macro "body". The key is the expansion facility, which replaces parameter placeholders (e.g. in the simplest instance $1, $2 etc) with actual parameters. This is missing. Also, there must be expansion of other already-defined macro names. This is already present.

The library has a simple interface:

enum myMacro = q{... $1 $2 $(anotherMacro($1))... };

// To mixin
mixin(expand(myMacro, "argument one", "argument two"));


Andrei
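
(A minimal CTFE sketch of such an expand, as one possible reading of the above - only plain $1..$9 substitution, expansion of nested macro names left out:)

string expand(string macroBody, string[] args...)
{
    string r;
    for (size_t i = 0; i < macroBody.length; i++)
    {
        // replace $1..$9 with the corresponding argument
        if (macroBody[i] == '$' && i + 1 < macroBody.length
            && macroBody[i + 1] >= '1' && macroBody[i + 1] <= '9')
        {
            r ~= args[macroBody[i + 1] - '1'];
            i++;
        }
        else
            r ~= macroBody[i];
    }
    return r;
}

enum myMacro = q{ auto tmp = $1 + $2; };

void main()
{
    int a = 1, b = 2;
    mixin(expand(myMacro, "a", "b")); // becomes: auto tmp = a + b;
    assert(tmp == 3);
}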
Dec 30 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/31/2011 12:02 AM, Andrei Alexandrescu wrote:
 On 12/30/11 3:51 PM, Timon Gehr wrote:
 On 12/30/2011 09:51 PM, Andrei Alexandrescu wrote:
 On 12/30/11 12:10 PM, Walter Bright wrote:
 On 12/30/2011 4:05 AM, Timon Gehr wrote:
 It certainly does. That is how all my code generation looks like. The
 fact that
 I am using string mixins to solve some problems shows that those are
 not
 'problems in D string mixins'.
I think your solution to parameterized strings is very nice. Can you write a brief article about it? This should be more widely known.
The idea is good, but nonhygienic: the macro's expansion picks up symbols from the expansion context.
What the template 'X' currently achieves is an improvement in syntax: string generated = "foo!\""~x~"\"(\""~bar(y)~"\")"; vs string generated = mixin(X!q{ foo!" (x)"(" (bar(y))") });
I understand that. But the whole system must be redesigned. Quoting from my email (please let's continue here so as to avoid duplication):

The macro facility should be very simple: any compile-time string can be a macro "body". The key is the expansion facility, which replaces parameter placeholders (e.g. in the simplest instance $1, $2 etc) with actual parameters. This is missing. Also, there must be expansion of other already-defined macro names. This is already present.

The library has a simple interface:

enum myMacro = q{... $1 $2 $(anotherMacro($1))... };

// To mixin
mixin(expand(myMacro, "argument one", "argument two"));


Andrei
I understand, but compared to how I solved the issue:

1. it invents an (arguably inferior) parameter passing system, even though there is one in the language.

2. it picks up all symbols used in $(...) from the caller's context rather than the callee's context and there is no way to get rid of that default, because the macro is unscoped.
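
(A single-file illustration of point 2 - the string merely mentions 'helper'; everything it names has to exist at the expansion site:)

enum callHelper = q{ helper(1) }; // defines nothing, only mentions 'helper'

int helper(int x) { return 10 * x; } // found at the mixin site below

void main()
{
    int r = mixin(callHelper); // resolved in *this* scope, not where the string came from
    assert(r == 10);
}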
Dec 30 2011
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 3:21 PM, Timon Gehr wrote:
 2. it picks up all symbols used in $(...) from the caller's context rather than
 the callee's context and there is no way to get rid of that default, because
the
 macro is unscoped.
That's characteristic of how macros work, and people want it that way. Otherwise, they'd use functions or templates.
Dec 30 2011
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 12/31/2011 12:34 AM, Walter Bright wrote:
 On 12/30/2011 3:21 PM, Timon Gehr wrote:
 2. it picks up all symbols used in $(...) from the caller's context
 rather than
 the callee's context and there is no way to get rid of that default,
 because the
 macro is unscoped.
That's characteristic of how macros work, and people want it that way. Otherwise, they'd use functions or templates.
I don't think that is true. It's an undesirable characteristic. I don't think it works well together with a module system. Note that I use arbitrary CTFE inside (...). That can be correcting the case of an identifier or any implementation detail. I don't want to require any module that expands the macro to import those implementation details publicly. If I actually want to pick up identifiers from the caller's scope, that is easy: I just embed another X template instantiation.
Dec 30 2011
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/30/11 5:21 PM, Timon Gehr wrote:
 On 12/31/2011 12:02 AM, Andrei Alexandrescu wrote:
 The library has a simple interface:

 enum myMacro = q{... $1 $2 $(anotherMacro($1))... };

 // To mixin
 mixin(expand(myMacro, "argument one", "argument two"));


 Andrei
I understand, but compared to how I solved the issue:

1. it invents an (arguably inferior) parameter passing system, even though there is one in the language.

2. it picks up all symbols used in $(...) from the caller's context rather than the callee's context and there is no way to get rid of that default, because the macro is unscoped.
Fair enough. I think your idea of defining a mini-macro-expansion system based on CTFE and strings is genius. I also think at the present the semantics are very unprincipled, and should not be popularized in any form lest D mixins acquire reputation similar to C macros.

Finally, I think you have the resources to work your idea into a wonderful system that will be principled, practical, and extremely powerful.

Good luck!

Andrei
Dec 30 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/31/2011 01:10 AM, Andrei Alexandrescu wrote:
 On 12/30/11 5:21 PM, Timon Gehr wrote:
 On 12/31/2011 12:02 AM, Andrei Alexandrescu wrote:
 The library has a simple interface:

 enum myMacro = q{... $1 $2 $(anotherMacro($1))... };

 // To mixin
 mixin(expand(myMacro, "argument one", "argument two"));


 Andrei
I understand, but compared to how I solved the issue:

1. it invents an (arguably inferior) parameter passing system, even though there is one in the language.

2. it picks up all symbols used in $(...) from the caller's context rather than the callee's context and there is no way to get rid of that default, because the macro is unscoped.
Fair enough. I think your idea of defining a mini-macro-expansion system based on CTFE and strings is genius. I also think at the present the semantics are very unprincipled, and should not be popularized in any form lest D mixins acquire reputation similar to C macros.
I think what you propose is a lot closer to C macros than what I already use. Therefore I don't understand what qualifies its semantics as unprincipled.
 Finally, I think you have the resources to work your idea into a wonderful
system
 that will be principled, practical, and extremely powerful.

 Good luck!

 Andrei
I'd be happy to extend the system, but currently I don't see it fall short of any of the three requirements. Can you help me out?
Dec 30 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/30/11 6:25 PM, Timon Gehr wrote:
 I'd be happy to extend the system, but currently I don't see it fall
 short any of the three requirements. Can you help me out?
I think it would be great to reproduce the expansion semantics of ddoc. Andrei
Dec 30 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/31/2011 03:16 AM, Andrei Alexandrescu wrote:
 On 12/30/11 6:25 PM, Timon Gehr wrote:
 I'd be happy to extend the system, but currently I don't see it fall
 short any of the three requirements. Can you help me out?
I think it would be great to reproduce the expansion semantics of ddoc. Andrei
So basically, just breaking infinite recursion on recursive identical instantiations? In what way does such a feature improve the expressiveness of the macro system?
Dec 30 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 6:59 PM, Timon Gehr wrote:
 On 12/31/2011 03:16 AM, Andrei Alexandrescu wrote:
 On 12/30/11 6:25 PM, Timon Gehr wrote:
 I'd be happy to extend the system, but currently I don't see it fall
 short any of the three requirements. Can you help me out?
I think it would be great to reproduce the expansion semantics of ddoc. Andrei
So basically, just breaking infinite recursion on recursive identical instantiations? In what way does such a feature improve the expressiveness of the macro system?
Because inevitably someone will write:

#define FOO a + FOO

and expect it to work (the correct expansion would be "a + FOO", not a stack overflow). The C preprocessor works this way, as do makefile macros, as Ddoc does.
Dec 30 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/31/2011 04:50 AM, Walter Bright wrote:
 On 12/30/2011 6:59 PM, Timon Gehr wrote:
 On 12/31/2011 03:16 AM, Andrei Alexandrescu wrote:
 On 12/30/11 6:25 PM, Timon Gehr wrote:
 I'd be happy to extend the system, but currently I don't see it fall
 short any of the three requirements. Can you help me out?
I think it would be great to reproduce the expansion semantics of ddoc. Andrei
So basically, just breaking infinite recursion on recursive identical instantiations? In what way does such a feature improve the expressiveness of the macro system?
Because inevitably someone will write:

#define FOO a + FOO

and expect it to work (the correct expansion would be "a + FOO", not a stack overflow). The C preprocessor works this way, as do makefile macros, as Ddoc does.
Makes sense, but why is it an issue if expansion is explicit?

enum FOO = q{a + FOO};
mixin(FOO);
Dec 30 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 8:02 PM, Timon Gehr wrote:
 On 12/31/2011 04:50 AM, Walter Bright wrote:
 Because inevitably someone will write:

 #define FOO a + FOO

 and expect it to work (the correct expansion would be "a + FOO", not a
 stack overflow). The C preprocessor works this way, as do makefile
 macros, as Ddoc does.
Makes sense, but why is it an issue if expansion is explicit? enum FOO = q{a + FOO}; mixin(FOO);
Because the expanded text is then rescanned for further macro replacement.
Dec 30 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/31/2011 05:19 AM, Walter Bright wrote:
 On 12/30/2011 8:02 PM, Timon Gehr wrote:
 On 12/31/2011 04:50 AM, Walter Bright wrote:
 Because inevitably someone will write:

 #define FOO a + FOO

 and expect it to work (the correct expansion would be "a + FOO", not a
 stack overflow). The C preprocessor works this way, as do makefile
 macros, as Ddoc does.
Makes sense, but why is it an issue if expansion is explicit? enum FOO = q{a + FOO}; mixin(FOO);
Because the expanded text is then rescanned for further macro replacement.
Yes, but in q{a + FOO} there is none.
Dec 30 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 8:34 PM, Timon Gehr wrote:
 On 12/31/2011 05:19 AM, Walter Bright wrote:
 On 12/30/2011 8:02 PM, Timon Gehr wrote:
 On 12/31/2011 04:50 AM, Walter Bright wrote:
 Because inevitably someone will write:

 #define FOO a + FOO

 and expect it to work (the correct expansion would be "a + FOO", not a
 stack overflow). The C preprocessor works this way, as do makefile
 macros, as Ddoc does.
Makes sense, but why is it an issue if expansion is explicit? enum FOO = q{a + FOO}; mixin(FOO);
Because the expanded text is then rescanned for further macro replacement.
Yes, but in q{a + FOO} there is none.
#define FOO a+Foo

FOO;

What is the text after macro expansion?
Dec 30 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/30/2011 10:50 PM, Walter Bright wrote:
 On 12/30/2011 8:34 PM, Timon Gehr wrote:
 On 12/31/2011 05:19 AM, Walter Bright wrote:
 On 12/30/2011 8:02 PM, Timon Gehr wrote:
 On 12/31/2011 04:50 AM, Walter Bright wrote:
 Because inevitably someone will write:

 #define FOO a + FOO

 and expect it to work (the correct expansion would be "a + FOO", not a
 stack overflow). The C preprocessor works this way, as do makefile
 macros, as Ddoc does.
Makes sense, but why is it an issue if expansion is explicit? enum FOO = q{a + FOO}; mixin(FOO);
Because the expanded text is then rescanned for further macro replacement.
Yes, but in q{a + FOO} there is none.
#define FOO a+Foo FOO; What is the text after macro expansion?
Blast, I meant

#define FOO a+FOO

FOO;
Dec 30 2011
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 12/31/2011 07:50 AM, Walter Bright wrote:
 On 12/30/2011 10:50 PM, Walter Bright wrote:
 On 12/30/2011 8:34 PM, Timon Gehr wrote:
 On 12/31/2011 05:19 AM, Walter Bright wrote:
 On 12/30/2011 8:02 PM, Timon Gehr wrote:
 On 12/31/2011 04:50 AM, Walter Bright wrote:
 Because inevitably someone will write:

 #define FOO a + FOO

 and expect it to work (the correct expansion would be "a + FOO",
 not a
 stack overflow). The C preprocessor works this way, as do makefile
 macros, as Ddoc does.
Makes sense, but why is it an issue if expansion is explicit? enum FOO = q{a + FOO}; mixin(FOO);
Because the expanded text is then rescanned for further macro replacement.
Yes, but in q{a + FOO} there is none.
#define FOO a+Foo FOO; What is the text after macro expansion?
Blast, I meant #define FOO a+FOO FOO;
FOO; -> a + FOO;

mixin(FOO~";"); -> a + FOO;

The two do the same thing.
Dec 31 2011
prev sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Timon Gehr" <timon.gehr gmx.ch> wrote in message 
news:jdlbpq$2b7e$1 digitalmars.com...
 What the template 'X' currently achieves is an improvement in syntax:

 string generated = "foo!\""~x~"\"(\""~bar(y)~"\")";
Ewww, who in the world uses double-quote strings for code containing quotes? That's not a fair comparison. This is a better comparison:

string generated = `foo!"`~x~`"("`~bar(y)~`")`;

vs

string generated = mixin(X!q{ foo!" (x)"(" (bar(y))") });
Jan 02 2012
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/02/2012 11:07 PM, Nick Sabalausky wrote:
 "Timon Gehr"<timon.gehr gmx.ch>  wrote in message
 news:jdlbpq$2b7e$1 digitalmars.com...
 What the template 'X' currently achieves is an improvement in syntax:

 string generated = "foo!\""~x~"\"(\""~bar(y)~"\")";
Ewww, who in the world uses double-quote strings for code containing quotes? That's not a fair comparison. This is a better comparison:

string generated = `foo!"`~x~`"("`~bar(y)~`")`;

vs

string generated = mixin(X!q{ foo!" (x)"(" (bar(y))") });
What if the code contains both " and `? Using `` strings for code that contains quotes is not a general solution.
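
(A tiny self-contained case with both kinds of quotes in the generated code; a token string carries it with no escaping at all:)

import std.stdio;

void main()
{
    enum code = q{ writeln(`back`, "tick\n"); }; // both ` and " inside, unescaped
    mixin(code);
}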
Jan 02 2012
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, January 02, 2012 23:19:19 Timon Gehr wrote:
 What if the code contains both " and `? Using `` strings for code that
 contains quotes is not a general solution.
True. But it's a solution that works most of the time. - Jonathan M Davis
Jan 02 2012
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 12/30/2011 07:10 PM, Walter Bright wrote:
 On 12/30/2011 4:05 AM, Timon Gehr wrote:
 It certainly does. That is how all my code generation looks like. The
 fact that
 I am using string mixins to solve some problems shows that those are not
 'problems in D string mixins'.
I think your solution to parameterized strings is very nice. Can you write a brief article about it? This should be more widely known.
Ok.
Dec 30 2011
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
This conversation has meandered into one very specific branch, but I just
want to add my 2c to the OP.
I agree, I want D to be a useful systems language too. These are my issues
to that end:

 * __forceinline ... I wasn't aware this didn't exist... and yes, despite
all this discussion, I still depend on this all the time. People are
talking about implementing forceinline by imitating macros using mixins...
crazy? Here's a solid reason I avoid mixins or procedurally generated code
(and the preprocessor in C for that matter, in favour of __forceinline):
YOU CAN DEBUG IT. In an inline function, the code exists in the source
file, just like any other function, you can STEP THE DEBUGGER through it,
and inspect the values easily. This is an underrated requirement. I would
waste hours on many days if I couldn't do this. I would only ever use
string mixins for the most obscure uses, preferring inline functions for
the sake of debugging 99% of the time.

 * vector type ... D has exactly no way to tell the compiler to allocate
128bit vector registers, load/store, and pass them to/from functions. That
is MOST of the register memory on virtually every modern processor, and D
can't address it... wtf?

 * inline assembler needs pseudo registers ... The inline assembler is
pretty crap, imitating C, which is outdated. Registers in assembly code
should barely ever be addressed directly, they should only be addressed by
TYPE, allowing the compiler to allocate available registers (and/or manage
storing the the stack where required) as with any other code. Inline
assembly without pseudo-registers is almost always an un-optimisation, and
this is also the reason why almost all C programmers use hardware opcode
intrinsics instead of inline assembly. There is no way without using
intrinsics in C to allow the compiler to perform optimal register
allocation, and this is still true for D, and in my opinion, just plain
broken.

 * __restrict ... I've said this before, but not being able to hint that
the compiler ignore possible pointer aliasing is a big performance problem,
especially when interacting with C libs.

 * multiple return values (in registers) ... (just because I opened a topic
about it before) This saves memory accesses in common cases where I want to
return (x, y), or (retVal, errorCode) for instance.
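
(The inline-assembler point above, sketched with DMD-style 32-bit inline asm - the algorithm is irrelevant, the hand-pinned register names are the problem:)

int rotl1(int x)
{
    asm
    {
        mov EAX, x; // EAX is pinned by hand...
        rol EAX, 1; // ...so the compiler cannot re-allocate registers
        mov x, EAX; //    or schedule freely around this block
    }
    return x;
}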

Walter made an argument "The same goes for all those language extensions
you mentioned. Those are not part of Standard C. They are vendor
extensions. Does that mean that C is not actually a systems language? No."
This is absurd... are you saying that you expect Iain to add these things
to GDC so that people can use them, and then create incompatible D code
with the 'standard' compiler?
Why would you intentionally fragment the compiler support of language
features rather than just making trivial (but important) features that
people do use part of the language?

This is a great example of why C is shit, and a good example of why I'm
interested in D at all...

On 29 December 2011 13:19, Vladimir Panteleev
<vladimir thecybershadow.net>wrote:

 On Thursday, 29 December 2011 at 09:16:23 UTC, Walter Bright wrote:

 Are you a ridiculous hacker? Inline x86 assembly that the compiler
 actually understands in 32 AND 64 bit code, hex string literals like x"DE
 ADB EEF" where spacing doesn't matter, the ability to set data alignment
 cross-platform with type.alignof = 16, load your shellcode verbatim into a
 string like so: auto str = import("shellcode.txt");
I would like to talk about this for a bit. Personally, I think D's system programming abilities are only half-way there. Note that I am not talking about use cases in high-level application code, but rather low-level, widely-used framework code, where every bit of performance matters (for example: memory copy routines, string builders, garbage collectors).

In-line assembler as part of the language is certainly neat, and in fact coming from Delphi to C++ I was surprised to learn that C++ implementations adopted different syntax for asm blocks. However, compared to some C++ compilers, it has severe limitations and is D's only trick in this alley.

For one thing, there is no way to force the compiler to inline a function (like __forceinline / __attribute((always_inline)) ). This is fine for high-level code (where users are best left with PGO and "the compiler knows best"), but sucks if you need a guarantee that the function must be inlined. The guarantee isn't just about inlining heuristics, but also implementation capabilities. For example, some implementations might not be able to inline functions that use certain language features, and your code's performance could demand that such a short function must be inlined. One example of this is inlining functions containing asm blocks - IIRC DMD does not support this. The compiler should fail the build if it can't inline a function tagged with forceinline, instead of shrugging it off and failing silently, forcing users to check the disassembly every time.

You may have noticed that GCC has some ridiculously complicated assembler facilities. However, they also open the way to the possibilities of writing optimal code - for example, creating custom calling conventions, or inlining assembler functions without restricting the caller's register allocation with a predetermined calling convention. In contrast, DMD is very conservative when it comes to mixing D and assembler. One time I found that putting an asm block in a function turned what were single instructions into blocks of 6 instructions each.

D's lacking in this area makes it impossible to create language features that are on the level of D's compiler built-ins. For example, I have tested three memcpy implementations recently, but none of them could beat DMD's standard array slice copy (despite that in release mode it compiles to a simple memcpy call). Why? Because the overhead of using a custom memcpy routine negated its performance gains. This might have been alleviated with the presence of sane macros, but no such luck. String mixins are not the answer: trying to translate macro-heavy C code to D using string mixins is string escape hell, and we're back to the level of shell scripts.

We've discussed this topic on IRC recently. From what I understood, Andrei thinks improvements in this area are not "impactful" enough, which I find worrisome. Personally, I don't think D qualifies as a true "system programming language" in light of the above. It's more of a compiled language with pointers and assembler. Before you disagree with any of the above, first (for starters) I'd like to invite you to translate Daniel Vik's C memcpy implementation to D: http://www.danielvik.com/2010/02/fast-memcpy-in-c.html . It doesn't even use inline assembler or compiler intrinsics.
Jan 04 2012
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Manu:

  * vector type ... D has exactly no way to tell the compiler to allocate
 128bit vector registers, load/store, and pass then to/from functions. That
 is MOST of the register memory on virtually every modern processor, and D
 can't address it... wtf?
Currently the built-in vector operations of D are not optimized, their syntax and semantics has some small holes that I'd like to see fixed (it's not just a matter of implementation bugs, I also mean design bugs). So I suggest first to improve them a lot, and only later, if necessary, to introduce intrinsics. Bye, bearophile
Jan 04 2012
parent reply Manu <turkeyman gmail.com> writes:
 Manu:

  * vector type ... D has exactly no way to tell the compiler to allocate
 128bit vector registers, load/store, and pass then to/from functions.
That
 is MOST of the register memory on virtually every modern processor, and D
 can't address it... wtf?
Currently the built-in vector operations of D are not optimized, their syntax and semantics has some small holes that I'd like to see fixed (it's not just a matter of implementation bugs, I also mean design bugs). So I suggest first to improve them a lot, and only later, if necessary, to introduce intrinsics.
I'm not referring to vector OPERATIONS. I only refer to the creation of a type to identify these registers... anything more than that can be done with inline asm, hardware intrinsics, etc, but the language MUST at least expose the type to allow register allocation and parameter passing. A language defined 128bit SIMD type would be fine for basically all architectures. Even though they support different operations on these registers, the size and allocation patterns are always the same across all architectures; 128 bits, 16byte aligned, etc. This allows at minimum platform independent expression of structures containing simd data, and calling of functions passing these types as args. SSE, VMX (PPC), VFP (ARM)... they all share the same rules.
Jan 04 2012
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Manu:

 I'm not referring to vector OPERATIONS. I only refer to the creation of a
 type to identify these registers...
Please, try to step back a bit and look at this problem from a bit more distance. D has vector operations, and so far they have received only a tiny amount of love. Are you able to find some ways to solve some of your problems using a hypothetical much better implementation of D vector operations? Please, think about the possibilities of this syntax. Think about future CPU evolution with SIMD registers 128, then 256, then 512, then 1024 bits long. In theory a good compiler is able to use them with no changes in the D code that uses vector operations. Intrinsics are an additive change, adding them later is possible. But I think fixing the syntax of vector ops is more important. I have some bug reports in Bugzilla about vector ops that are sleeping there since two years or so, and they are not about implementation performance. I think the good Hara will be able to implement those syntax fixes in a matter of just one day or very few days if a consensus is reached about what actually is to be fixed in D vector ops syntax. Instead of discussing about *adding* something (register intrinsics) I suggest to discuss about what to fix about the *already present* vector op syntax. This is not a request to just you Manu, but to this whole newsgroup. Bye, bearophile
Jan 04 2012
next sibling parent Peter Alexander <peter.alexander.au gmail.com> writes:
On 5/01/12 12:42 AM, bearophile wrote:
 Manu:

 I'm not referring to vector OPERATIONS. I only refer to the creation of a
 type to identify these registers...
Please, try to step back a bit and look at this problem from a bit more distance. D has vector operations, and so far they have received only a tiny amount of love. Are you able to find some ways to solve some of your problems using a hypothetical much better implementation of D vector operations? Please, think about the possibilities of this syntax. Think about future CPU evolution with SIMD registers 128, then 256, then 512, then 1024 bits long. In theory a good compiler is able to use them with no changes in the D code that uses vector operations. Intrinsics are an additive change, adding them later is possible. But I think fixing the syntax of vector ops is more important. I have some bug reports in Bugzilla about vector ops that are sleeping there since two years or so, and they are not about implementation performance. I think the good Hara will be able to implement those syntax fixes in a matter of just one day or very few days if a consensus is reached about what actually is to be fixed in D vector ops syntax. Instead of discussing about *adding* something (register intrinsics) I suggest to discuss about what to fix about the *already present* vector op syntax. This is not a request to just you Manu, but to this whole newsgroup. Bye, bearophile
D has no alignment support, so there is no way to specify that you want a float[4] to be aligned on 16-bytes, which means there is no way for the compiler to generate code to exploit SSE well. It has to be conservative and assume unaligned.

Suppose alignment support is added:

alias align(16) float[4] vec4f;

vec4f a, b;
...
a[0] = a[3];
a[1] = a[2];
a[2] = b[0];
a[3] = b[1];

Is it reasonable to expect compilers to generate a single shuffle instruction from this? What about more complicated code like computing a dot product. What D code do I write to get the compiler to generate the expected machine code?

If we get alignment support and lots of work goes into optimizing vector ops for this then we can go a long way without intrinsics, but I don't think we'll ever be able to completely remove the need for intrinsics.
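
(For the dot product case, the best one can write today is roughly this - whether any current compiler maps it onto SSE is exactly the open question:)

float dot(float[4] a, float[4] b)
{
    float[4] tmp;
    tmp[] = a[] * b[]; // element-wise vector op
    return tmp[0] + tmp[1] + tmp[2] + tmp[3]; // horizontal sum, by hand
}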
Jan 04 2012
prev sibling parent reply Manu <turkeyman gmail.com> writes:
On 5 January 2012 02:42, bearophile <bearophileHUGS lycos.com> wrote:

 Manu:

 I'm not referring to vector OPERATIONS. I only refer to the creation of a
 type to identify these registers...
Please, try to step back a bit and look at this problem from a bit more distance. D has vector operations, and so far they have received only a tiny amount of love. Are you able to find some ways to solve some of your problems using a hypothetical much better implementation of D vector operations? Please, think about the possibilities of this syntax. Think about future CPU evolution with SIMD registers 128, then 256, then 512, then 1024 bits long. In theory a good compiler is able to use them with no changes in the D code that uses vector operations.
These are all fundamentally different types, like int and long.. float and double... and I certainly want a keyword to identify each of them. Even if the compiler is trying to make auto vector optimisations, you can't deny programmers explicit control of the hardware when they want/need it.

Look at x86 compilers: they have been TRYING to perform automatic SSE optimisations for 10 years, with basically no success... do you really think you can do better than all that work by Microsoft and GCC? In my experience, I've even run into a lot of VC's auto-SSE-ed code that is SLOWER than the original float code. Let's not even mention architectures that receive much less love than x86, and are arguably more important (ARM; slower, simpler processors with more demand to perform well, and not waste power).

Also, D is NOT a good compiler, it's a rubbish compiler with respect to code generation. And with a community so small, it has no hope of becoming a 'good' compiler any time soon.. Even C/C++ compilers that have been around for decades used by millions have been promising optimisations that are still not available, and the ones that are come at the expense of decades of smart engineers on huge paycheques.
 Intrinsics are an additive change, adding them later is possible. But I
 think fixing the syntax of vector ops is more important. I have some bug
 reports in Bugzilla about vector ops that are sleeping there since two
 years or so, and they are not about implementation performance.
Vector ops and SIMD ops are different things. float[4] (or more realistically, float[3]) should NOT be a candidate for automatic SIMD implementation, likewise, simd_type should not have its components individually accessible. These are operations the hardware can not actually perform. So no syntax to worry about, just a type.
 I think the good Hara will be able to implement those syntax fixes in a
 matter of just one day or very few days if a consensus is reached about
 what actually is to be fixed in D vector ops syntax.
 Instead of discussing about *adding* something (register intrinsics) I
 suggest to discuss about what to fix about the *already present* vector op
 syntax. This is not a request to just you Manu, but to this whole newsgroup.
And I think this is exactly the wrong approach. A vector is NOT an array of 4 (actually, usually 3) floats. It should not appear as one. This is an overly complicated and ultimately wrong way to engage this hardware. Imagine the complexity in the compiler to try and force float[4] operations into vector arithmetic vs adding a 'v128' type which actually does what people want anyway...

SIMD units are not float units, they should not appear like an aggregation of float units. They have:
 * Different error semantics, exception handling rules, sometimes different precision...
 * Special alignment rules.
 * Special literal expression/assignment.
 * You can NOT access individual components at will.
 * May be reinterpreted at any time as float[1] float[4] double[2] short[8] char[16], etc... (up to the architecture intrinsics)
 * Can not be involved in conventional comparison logic (array of floats would make you think they could)
 *** Can NOT interact with the regular 'float' unit... Vectors as an array of floats certainly suggests that you can interact with scalar floats...

I will use architecture intrinsics to operate on these regs, and put that nice and neatly behind a hardware vector type with version()'s for each architecture, and an API with a whole lot of sugar to make them nice and friendly to use.

My argument is that even IF the compiler some day attempts to make vector optimisations to float[4] arrays, the raw hardware should be exposed first, and allow programmers to use it directly. This starts with a language defined (platform independent) v128 type.
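
(For contrast, what touching those registers looks like today - DMD-style x86 (32-bit) inline asm, function and variable names mine. The XMM registers are reachable, but there is no type that lets a value stay in one across ordinary D code:)

void add4f()
{
    float[4] a = [1, 2, 3, 4];
    float[4] b = [10, 20, 30, 40];
    float[4] r;
    auto pa = a.ptr, pb = b.ptr, pr = r.ptr;
    asm
    {
        mov EAX, pa;
        mov ECX, pb;
        mov EDX, pr;
        movups XMM0, [EAX]; // the vector exists in XMM0 only inside this block
        movups XMM1, [ECX];
        addps  XMM0, XMM1;
        movups [EDX], XMM0; // it has to go back through memory to reach D code
    }
}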
Jan 05 2012
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/05/2012 10:02 AM, Manu wrote:
...
 Also, D is NOT a good compiler, it's a rubbish compiler with respect to
 code generation.
 [snip.]
D is not a compiler, it is a language. Furthermore it is not true that DMD's backend is rubbish and there are already more backends than just the DMC backend.

DMD: DMC backend, _fast code gen_ and very pleasant to use for debug builds.

GDC: GCC backend, optimizes as well as GCC. This is what I use for release. Takes about three times longer for a debug build than DMD. A lot less usable for the edit-compile-test cycle than DMD.

LDC: LLVM backend, implements some additional optimizations in the front end. I don't have LDC installed, but iirc it is also a lot slower than DMD.

I think it would be nice if you stopped spreading FUD. You seem to have reasonable requests.
Jan 05 2012
next sibling parent reply Manu <turkeyman gmail.com> writes:
 D is not a compiler, it is a language. Furthermore it is not true that
 DMDs backend is rubbish and there are already more backends than just the
 DMC backend.
Sorry, I was generalising a little in that claim. And where I say 'rubbish', I was drawing a comparison to the maturity of C compilers for x86, which STILL have trouble and make lots of mistakes with centuries of man hours of work.

DMD has inferior code gen (there was a post last night comparing some disassemblies of trivial programs with GDC), will probably never rival GCC, that's fine, it's a reference, I get that. But to say that using GDC will magically fix code gen is also false. I'm not familiar with the GCC code, so I may be wrong, but my understanding is that there is frontend work, and frontend-GCC glue work that will allow for back end optimisation (which GCC can do quite well) to work properly. This is still a lot of work for a small OSS team.

I also wonder if the D language provides some opportunities for optimisation that aren't expressible in other languages, and therefore may not already have an expression in the GCC back end... so I can imagine some of the future optimisations frequently discussed in this forum won't just magically appear with GCC/LLVM maturity. I can't imagine Iain and co extending the GCC back end to support some obscure D optimisations happening any time soon.

The point I was making (which you seem to have missed thanks to my inflammatory comment ;), was that I don't have faith that compiler maturity will solve all these problems in prompt time, and even if they do for x86, what about for less common architectures that receive a lot less love (as is also the case in C)?

I make this argument in support of the language expressing optimal constructs with ease and by default, rather than expressing some concept that feels nice to programmers, but puts a burden on the whole-program-optimiser to fix. For example, virtual-by-default RELIES on whole-program-optimisation to fix, whereas final by default has no performance implications, and will produce the best code automatically.

 I think it would be nice if you stopped spreading FUD. You seem to have
 reasonable requests.
Perhaps a fair request, but I only do this because after a couple of months now, I have a good measure of FUD, and I receive very little response to anything I've raised that would make me feel otherwise. The most distressing thing to me is the pattern I see where most of my more trivial (but still significant) points are outright dismissed, and the hard ones are ignored, rather than reasonably argued and putting my FUD to rest. :)

I also accept that I produce very little evidence to support any of my claims, so I'm easy to ignore, but this is because all my work and experience is commercial, private, and I can't easily extract anything without wasting a lot of work time to present it... not to mention breaking NDA's. Most problem cases aren't trivial, require a large context to prove with benchmarks. I can't easily write a few lines and say "here you go".

At some level I'd like to think people would accept the word of a seasoned game engine dev who's genuinely interested in adopting the language for that sort of work, but I completely understand those who are skeptical. ;)
Jan 05 2012
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/5/12 10:49 AM, Manu wrote:
 I make this argument in support of the language expressing optimal
 constructs with ease and by default, rather than expressing some concept
 that feels nice to programmers, but puts a burden on the
 whole-program-optimiser to fix.
 For example, virtual-by-default RELIES on whole-program-optimisation to
 fix, whereas final by default has no performance implications, and will
 produce the best code automatically.
Your point is well meaning. I trust you understood and internalized your options outside a language change: using final or private in interfaces and classes, using struct instead of class, switching design to static polymorphism etc. Our assessment is that these work very well, promote good class hierarchy design, require reasonably little work from the programmer, and do not need advanced compiler optimizations. The D programming language is stabilizing. Making a change of such a magnitude is not negotiable, and moreover we believe the current design is very good in that regard so we are twice as motivated to keep it. At this point you need to evaluate whether you can live with this annoyance or forgo use of the language. Thanks, Andrei
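For illustration, a minimal sketch of what those options look like in code today (the class, struct, and method names here are invented):

    // final methods and structs avoid the virtual dispatch under discussion
    class Renderer
    {
        final void submit() { }   // explicitly non-virtual: direct call, inlinable
        void onResize() { }       // virtual by default, as D class methods are
    }

    final class Texture           // final class: nothing can override its methods
    {
        void bind() { }
    }

    struct Vec3                   // structs have no vtable at all
    {
        float x = 0, y = 0, z = 0;
        void scale(float s) { x *= s; y *= s; z *= s; }
    }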
Jan 05 2012
next sibling parent reply Manu <turkeyman gmail.com> writes:
On 5 January 2012 19:35, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org>wrote:

 On 1/5/12 10:49 AM, Manu wrote:

 I make this argument in support of the language expressing optimal
 constructs with ease and by default, rather than expressing some concept
 that feels nice to programmers, but puts a burden on the
 whole-program-optimiser to fix.
 For example, virtual-by-default RELIES on whole-program-optimisation to
 fix, whereas final by default has no performance implications, and will
 produce the best code automatically.
Your point is well meaning. I trust you understood and internalized your options outside a language change: using final or private in interfaces and classes, using struct instead of class, switching design to static polymorphism etc. Our assessment is that these work very well, promote good class hierarchy design, require reasonably little work from the programmer, and do not need advanced compiler optimizations. The D programming language is stabilizing. Making a change of such a magnitude is not negotiable, and moreover we believe the current design is very good in that regard so we are twice as motivated to keep it. At this point you need to evaluate whether you can live with this annoyance or forgo use of the language.
I do realise all the implementation details you suggest. My core point is that this is dangerous. A new and/or junior programmer is not likely to know all that... they will most probably type 'class', and then start typing methods. Why would they do anything else? Every other language I can think of trains them to do that. The D catch phrase 'the right thing is the easiest thing to do' doesn't seem to hold up here... although that does depend on your point of view on 'right', for which I've made my argument numerous times, so I'll desist from here on ;)

I also realise this issue is non-negotiable. I said that in a previous email, and I'm prepared to live with it (although I wonder if a compiler option would be possible?)... As said, I still felt it was important to raise this one for conversation's sake, and to make sure this point of view towards issues like this is more seriously considered in future.

That said, this is just one of numerous issues myself and the OP raised. I don't know why this one became the most popular for discussion... my suspicion is that it's because this is the easiest of my complaints to dismiss and shut down ;) This is also the least interesting to me personally of the issues I and the OP raised (knowing it can't be changed)... I'd rather be discussing the others ;)
Jan 05 2012
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 10:02 AM, Manu wrote:
 That said, this is just one of numerous issues myself and the OP raised. I
don't
 know why this one became the most popular for discussion... my suspicion is
that
 is because this is the easiest of my complaints to dismiss and shut down ;)
That's a common phenomenon, known as bikeshedding. Issues that are easy to understand, everyone will weigh in on. The hard ones require an investment of effort to understand, and few will do it.
Jan 05 2012
prev sibling next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Jan 5, 2012, at 10:02 AM, Manu wrote:
 That said, this is just one of numerous issues myself and the OP raised. I don't know why this one became the most popular for discussion... my suspicion is that is because this is the easiest of my complaints to dismiss and shut down ;)

It's also about the only language change among the issues you mentioned. Most of the others are QOI issues for compiler vendors. What I've been curious about is if you really have a need for the performance that would be granted by these features, or if this is more of an idealistic issue.
Jan 05 2012
parent Peter Alexander <peter.alexander.au gmail.com> writes:
On 5/01/12 7:41 PM, Sean Kelly wrote:
 On Jan 5, 2012, at 10:02 AM, Manu wrote:
 That said, this is just one of numerous issues myself and the OP raised. I
don't know why this one became the most popular for discussion... my suspicion
is that is because this is the easiest of my complaints to dismiss and shut
down ;)
It's also about the only language change among the issues you mentioned. Most of the others are QOI issues for compiler vendors. What I've been curious about is if you really have a need for the performance that would be granted by these features, or if this is more of an idealistic issue.
It's not idealistic. For example, in my current project, I have a 3x perf improvement by rewriting that function with a few hundred lines of inline asm, purely to use SIMD instructions. This is a nuisance because:

(a) It's hard to maintain. I have to thoroughly document what registers I'm using for what just so that I don't forget.

(b) Difficult to optimize further. I could optimize the inline assembly further by doing better scheduling of instructions, but instruction scheduling naturally messes up the organization of your code, which makes it a maintenance nightmare.

(c) It's not cross platform. Luckily x86/x86_64 are similar enough that I can write the code once and patch up the differences with CTFE + string mixins.

I know other parts of my code that would benefit from SIMD, but it's too much hassle to write and maintain inline assembly.

If we had support for

    align(16) float[4] a, b;
    a[] += b[]; // addps on x86

then that would only solve the problem when you are doing "float-like" operations (addition, multiplication etc.). There's no obvious existing syntax for doing things like shuffles, conversions, SIMD square roots, cache control etc. that would naturally match to SIMD instructions.

Also, there's no way to tell the compiler whether you want to treat a float[4] as an array or a vector. Vectors are suited for data parallel execution whereas arrays are suited for indexing. If the compiler makes the wrong decision then you suffer heavily.

Ideally, we'd introduce vector types, e.g. vec_float4, vec_int4, vec_double2 etc. These would naturally match to vector registers on CPUs and be aligned appropriately for the target platform. Elementary operations would match naturally and generate the code you expect. Shuffling and other non-elementary operations would require the use of intrinsics.

    // 4 vector norms in parallel
    vec_float4 xs, ys, zs, ws;
    vec_float4 lengths = vec_sqrt(xs * xs + ys * ys + zs * zs + ws * ws);

On x86 w/SSE, this would ideally generate:

    // assuming xs in xmm0, ys in xmm1 etc.
    mulps xmm0, xmm0;
    mulps xmm1, xmm1;
    addps xmm0, xmm1;
    mulps xmm2, xmm2;
    addps xmm0, xmm2;
    mulps xmm3, xmm3;
    addps xmm0, xmm3;
    sqrtps xmm0, xmm0;

On platforms that don't support the vector types natively, there are two options: (1) a compile error, or (2) compile anyway, replacing them with float ops. I think this is the only sensible way forward.
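For contrast, a rough sketch of about the closest you can express today with plain array operations (the function and names are only for illustration; nothing here guarantees the work stays in SIMD registers):

    import std.math : sqrt;

    // 4 vector norms via array ops: the multiplies/adds may or may not be
    // vectorised, and the square roots fall back to one scalar call each
    void norms4(ref float[4] xs, ref float[4] ys, ref float[4] zs,
                ref float[4] ws, ref float[4] lengths)
    {
        float[4] sq;
        sq[] = xs[] * xs[] + ys[] * ys[] + zs[] * zs[] + ws[] * ws[];
        foreach (i, v; sq)
            lengths[i] = sqrt(v);   // no sqrtps here
    }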
Jan 05 2012
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 5 January 2012 21:41, Sean Kelly <sean invisibleduck.org> wrote:

 On Jan 5, 2012, at 10:02 AM, Manu wrote:
 That said, this is just one of numerous issues myself and the OP raised.
I don't know why this one became the most popular for discussion... my suspicion is that is because this is the easiest of my complaints to dismiss and shut down ;) It's also about the only language change among the issues you mentioned. Most of the others are QOI issues for compiler vendors. What I've been curious about is if you really have a need for the performance that would be granted by these features, or if this is more of an idealistic issue.
I think they are *all* language requests. They could be implemented by 3rd party compilers, but these things are certainly nice to standardise in the language, or you end up with C, which is a mess I'm trying to escape.

Of the topics being discussed:

  * vector type - I can't be expected to write a game without using the vector hardware... I'd rather use a portable standardised type than a GDC extension. Why the resistance? I haven't heard any good arguments against, other than some murmurings about float[4] as the official syntax, which I gave detailed arguments against.

  * inline asm supporting pseudo regs - As an extension of using the vector hardware, I'll need inline asm, and inline assembly without pseudo regs is pretty useless... it's mandatory that the compiler schedule the register allocation, otherwise inline asm will most likely be an un-optimisation. If D had pseudo regs in its inline assembler, it would make it REALLY attractive for embedded systems programmers. In lieu of that, I need to use opcode intrinsics instead, which I believe GDC exposes, but again, I'm done with C and versioning (#ifdef-ing) every compiler I intend to use. Why not standardise these things? At least put the intrinsics in the standard lib...

  * __restrict - Not a deal breaker, but the console game dev industry uses this all the time. There are countless articles on the topic (many are private or on console vendor forums). If this is not standardised in the language, GDC will still expose it I'm sure, fragmenting the language.

  * __forceinline - I'd rather have a proper keyword than using tricks like mixins and stuff as have been discussed. The reason for this is debugging. Code is not inlined in debug builds, looks & feels like normal code, can still evaluate and step like regular code... and is guaranteed to inline properly when optimised. This just saves time; no-frills debugging.

  * multiple return values - This is a purely idealistic feature request, but would lead to some really nice optimisations while retaining very tidy code if implemented properly. Other languages support this, it's a nice modern feature... why not have it in D? (For contrast with what's expressible today, see the sketch at the end of this post.)

I couldn't release the games I do without efficient use of vector hardware and __restrict used in appropriate places. I use them every day, and I wouldn't begin a project in D without knowing that these features are supported, or are definitely coming to the language.

On fixed hardware systems, the bar is high and it's very competitive... you can't waste processor time. Trust me when I say that using __restrict appropriately might lead to twice as many particles on screen, or allow more physics bodies, or more accurate simulation. SIMD hardware is mandatory, and will usually increase performance 2-5 times in my experience. The type of code that usually benefits the most ranges from particle simulation, collision/physics, procedural geometry/texturing, and funnily enough, memcopy ;) ... stock memcopy doesn't take advantage of 16-byte SIMD registers for copying memory, and I'll bet D doesn't either.

A little while back I rewrote a bitplane compositor (raw binary munging, not a typical vector hardware job) in VMX. Very tricky, but it was around 10 times faster... which is good, because our game had to hold a rock solid 60fps, and it saved the build and even allowed us to add some more nice features ;)

If D can't compete with, or beat, C, it won't be used in this market on high end products, though it's perhaps still viable on smaller/not-cutting-edge projects if productivity is considered more important. Engine programmers are thoroughly aware of code generation, and in C, tricks/techniques to coerce the compiler to generate the code you want to see are commonplace... and often very, very ugly. I think the industry would be very impressed and enthusiastic if D were able to generate the best possible code with conventional and elegant language semantics, without annoying tricks or proprietary compiler extensions to do so.
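On the multiple-return-values point, a rough sketch of the closest thing D offers today -- a library tuple, which comes back as an ordinary struct rather than in registers, so the ABI benefit described above is not actually delivered (the names below are only for illustration):

    import std.typecons : Tuple, tuple;

    // returns quotient and remainder together, packed into a struct
    Tuple!(int, int) divmod(int a, int b)
    {
        return tuple(a / b, a % b);
    }

    void example()
    {
        auto r = divmod(7, 3);
        assert(r[0] == 2 && r[1] == 1);
    }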
Jan 05 2012
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Jan 5, 2012, at 12:36 PM, Manu wrote:

 On 5 January 2012 21:41, Sean Kelly <sean invisibleduck.org> wrote:
 On Jan 5, 2012, at 10:02 AM, Manu wrote:
 That said, this is just one of numerous issues myself and the OP raised. I don't know why this one became the most popular for discussion... my suspicion is that is because this is the easiest of my complaints to dismiss and shut down ;)

 It's also about the only language change among the issues you mentioned. Most of the others are QOI issues for compiler vendors. What I've been curious about is if you really have a need for the performance that would be granted by these features, or if this is more of an idealistic issue.

 I think they are all language requests. They could be implemented by 3rd party compilers, but these things are certainly nice to standardise in the language, or you end up with C, which is a mess I'm trying to escape.

It's a grey area I suppose. Half of the features you list don't change the language, they simply allow the compiler to make optimizations it otherwise couldn't. I suppose the D way would be to make them ' ' prefixed and provide some means for having the compiler ignore them if it didn't recognize them. This wouldn't fragment the language so much as make generated code more efficient on supporting platforms.

 Of the topics being discussed:
   * vector type - I can't be expected to write a game without using the vector hardware... I'd rather use a portable standardised type than a GDC extension. Why the resistance? I haven't heard any good arguments against, other than some murmurings about float[4] as the official syntax, which I gave detailed arguments against.

Could a vector type be defined in the library? Aside from alignment, there doesn't seem to be anything that requires compiler support. Or am I missing something?

   * inline asm supporting pseudo regs - As an extension of using the vector hardware, I'll need inline asm, and inline assembly without pseudo regs is pretty useless... it's mandatory that the compiler schedule the register allocation, otherwise inline asm will most likely be an un-optimisation. If D had pseudo regs in its inline assembler, it would make it REALLY attractive for embedded systems programmers.

This would certainly be nice. When I drop into ASM I generally couldn't care less about which actual register I use. I just want to call something specific.

     In lieu of that, I need to use opcode intrinsics instead, which I believe GDC exposes, but again, I'm done with C and versioning (#ifdef-ing) every compiler I intend to use. Why not standardise these things? At least put the intrinsics in the standard lib...

   * __restrict - Not a deal breaker, but the console game dev industry uses this all the time. There are countless articles on the topic (many are private or on console vendor forums). If this is not standardised in the language, GDC will still expose it I'm sure, fragmenting the language.

   * __forceinline - I'd rather have a proper keyword than using tricks like mixins and stuff as have been discussed. The reason for this is debugging. Code is not inlined in debug builds, looks & feels like normal code, can still evaluate and step like regular code... and is guaranteed to inline properly when optimised. This just saves time; no-frills debugging.

I'd say these are QOI issues, as above.

   * multiple return values - This is a purely idealistic feature request, but would lead to some really nice optimisations while retaining very tidy code if implemented properly. Other languages support this, it's a nice modern feature... why not have it in D?

I can see the ABI rules for this getting really complicated, much like how parameter passing rules on x64 are insanely complex compared to x32. But I agree that it would be a nice feature to have.

 I couldn't release the games I do without efficient use of vector hardware and __restrict used in appropriate places. I use them every day, and I wouldn't begin a project in D without knowing that these features are supported, or are definitely coming to the language.

I know it's a major time commitment, but the best way to realize any new feature quickly is to create a pull request. Feature proposals have a way of being lost if they never extend beyond this newsgroup.
Jan 05 2012
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 5 January 2012 23:03, Sean Kelly <sean invisibleduck.org> wrote:

 I think they are all language requests. They could be implemented by 3rd
party compilers, but these things are certainly nice to standardise in the language, or you end up with C, which is a mess I'm trying to escape. It's a grey area I suppose. Half of the features you list don't change the language, they simply allow the compiler to make optimizations it otherwise couldn't. I suppose the D way would be to make them ' ' prefixed and provide some means for having the compiler ignore them if it didn't recognize them. This wouldn't fragment the language so much as make generated code more efficient on supporting platforms.
Precisely. That's all I want :) .. formal definition of these concepts, allowing GDC and co to feed that information through to the backend which already supports these concepts :)
 Could a vector type be defined in the library?  Aside from alignment,
 there doesn't seem to be anything that requires compiler support.  Or am I
 missing something?
I think there's a lot you're missing. Load/store patterns, register allocation, parameter argument convention (ABI details), literals and assignment, exception handling/error conditions, and alignment... The actual working functional stuff could/would be done in a library, but even then, if the opcode intrinsic names weren't standardised, it'd be an awful mess behind the scenes aggregating all the different names/terminology from each compiler implementation.
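For illustration, a naive library-only sketch (the type and names are invented) of how far alignment and operators get you -- everything in the list above is still left entirely to the compiler:

    // a library type can request alignment and express element-wise maths,
    // but it cannot dictate register allocation, the calling convention, or
    // guarantee that the array ops below actually become SIMD instructions
    struct Vec4
    {
        align(16) float[4] v;

        Vec4 opBinary(string op : "+")(Vec4 rhs) const
        {
            Vec4 r;
            r.v[] = v[] + rhs.v[];
            return r;
        }
    }

    void example()
    {
        Vec4 a, b;
        a.v[] = 1;
        b.v[] = 2;
        auto c = a + b;   // passed and returned through the plain struct ABI
    }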
   * inline asm supporting pseudo regs - As an extension of using the
 vector hardware, I'll need inline asm, and inline assembly without pseudo
 regs is pretty useless... it's mandatory that the compiler shedule the
 register allocation otherwise inline asm will most likely be an
 un-optimisation. If D had pseudo regs in its inline assembler, it would
 make it REALLY attractive for embedded systems programmers.

 This would certainly be nice.  When I drop into ASM I generally could care
 less about which actual register I use.  I just want to call something
 specific.
It's usually actually disruptive to the program around the inline asm block if you are naming registers explicitly... better to let the compiler assign them, and intelligently flush values to the stack if it runs out of registers. AT&T asm syntax allows this: you define 'parameter' types using some silly constraint string, and then refer to %1, %2, %3 in place of registers in your asm code, and the compiler performs the register assignment. Most C compilers also don't allow program optimisation/rescheduling around inline asm blocks, which makes them useless, and I'll bet GDC suffers the same problem right now (...?)
   * __restrict - Not a deal breaker, but console game dev industry uses
 this all the time. There are countless articles on the topic (many are
 private or on console vendor forums). If this is not standardised in the
 language, GDC will still expose it I'm sure, fragmenting the language.
   * __forceinline - I'd rather have a proper keyword than using tricks
like mixins and stuff as have been discussed. The reason for this is debugging. Code is not-inlined in debug builds, looks&feels like normal code, can still evaluate, and step like regular code... and guarantee that it inlines properly when optimised. This just saves time; no frills debugging. I'd say these are QOI issues, as above.
Yup, so just standardise the names for these attributes and pass the info through to GCC's back end... done :) ... Although I'm sure __forceinline isn't so simple.
   * multiple return values - This is a purely idealistic feature
request, but would lead to some really nice optimisations while retaining very tidy code if implemented properly. Other languages support this, it's a nice modern feature... why not have it in D? I can see the ABI rules for this getting really complicated, much like how parameter passing rules on x64 are insanely complex compared to x32. But I agree that it would be a nice feature to have.
How so? Parameters are passed in a sequence of regs of the appropriate type, to a point, at which stage they get put on the stack... is x64 somehow more complicated than that? Multiple return values would use the exact same regs in reverse. There should be no side effects; the calling function has already stored off (or has the capability to store off) any save regs in order to pass args in the first place.
 I couldn't release the games I do without efficient use of vector
 hardware and __restrict used in appropriate places. I use them every day,
 and I wouldn't begin a project in D without knowing that these features are
 supported, or are definitely coming to the language.

 I know it's a major time commitment, but the best way to realize any new
 feature quickly is to create a pull request.  Feature proposals have a way
 of being lost if they never extend beyond this newsgroup.
Fair call, but I don't have time to get involved at that level right now, not by a long shot. I'm just a potential customer trying to give it a fair go at this point... ;)
Jan 05 2012
prev sibling next sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 5 January 2012 20:36, Manu <turkeyman gmail.com> wrote:
 On 5 January 2012 21:41, Sean Kelly <sean invisibleduck.org> wrote:
 On Jan 5, 2012, at 10:02 AM, Manu wrote:
 That said, this is just one of numerous issues myself and the OP raised. I don't know why this one became the most popular for discussion... my suspicion is that is because this is the easiest of my complaints to dismiss and shut down ;)

 It's also about the only language change among the issues you mentioned. Most of the others are QOI issues for compiler vendors. What I've been curious about is if you really have a need for the performance that would be granted by these features, or if this is more of an idealistic issue.

 I think they are all language requests. They could be implemented by 3rd party compilers, but these things are certainly nice to standardise in the language, or you end up with C, which is a mess I'm trying to escape.

 Of the topics being discussed:
   * vector type - I can't be expected to write a game without using the vector hardware... I'd rather use a portable standardised type than a GDC extension. Why the resistance? I haven't heard any good arguments against, other than some murmurings about float[4] as the official syntax, which I gave detailed arguments against.

I dabbled with vector types; none of the vector builtins are hashed out in GDC, because it really does require that a vector be a unique type, distinct from a normal array.

   * inline asm supporting pseudo regs - As an extension of using the vector hardware, I'll need inline asm, and inline assembly without pseudo regs is pretty useless... it's mandatory that the compiler schedule the register allocation, otherwise inline asm will most likely be an un-optimisation. If D had pseudo regs in its inline assembler, it would make it REALLY attractive for embedded systems programmers.
     In lieu of that, I need to use opcode intrinsics instead, which I believe GDC exposes, but again, I'm done with C and versioning (#ifdef-ing) every compiler I intend to use. Why not standardise these things? At least put the intrinsics in the standard lib...

This is only possible using GDC extended asm - which is really GCC asm but encapsulated in {} instead of ();

   * __restrict - Not a deal breaker, but the console game dev industry uses this all the time. There are countless articles on the topic (many are private or on console vendor forums). If this is not standardised in the language, GDC will still expose it I'm sure, fragmenting the language.

I don't think D enforces any sort of aliasing rules, but it would be nice to turn on strict aliasing though...

   * __forceinline - I'd rather have a proper keyword than using tricks like mixins and stuff as have been discussed. The reason for this is debugging. Code is not inlined in debug builds, looks & feels like normal code, can still evaluate and step like regular code... and is guaranteed to inline properly when optimised. This just saves time; no-frills debugging.

__forceinline still won't be a guarantee though.

--
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
Jan 05 2012
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Jan 5, 2012, at 1:42 PM, Manu wrote:

 Most C compilers also don't allow program optimisation/rescheduling around inline asm blocks, this makes them useless, and I'll bet GDC suffers the same problem right now (...?)

Not sure about GDC.  At one time, I pushed for keeping "volatile" alive so asm blocks could be labeled to tell the compiler not to optimize across them.  I recall Walter rejecting the idea because compilers shouldn't optimize across asm blocks.  This should probably be revisited at some point.

   * __restrict - Not a deal breaker, but the console game dev industry uses this all the time. There are countless articles on the topic (many are private or on console vendor forums). If this is not standardised in the language, GDC will still expose it I'm sure, fragmenting the language.

   * __forceinline - I'd rather have a proper keyword than using tricks like mixins and stuff as have been discussed. The reason for this is debugging. Code is not inlined in debug builds, looks & feels like normal code, can still evaluate and step like regular code... and is guaranteed to inline properly when optimised. This just saves time; no-frills debugging.

 I'd say these are QOI issues, as above.

 Yup, so just standardise the names for these attributes and pass the info through to GCC's back end... done :) ... Although I'm sure __forceinline isn't so simple.

I'm sure it isn't.  Though as long as compilers aren't required to support it, it's just a matter of making it work, right? ;-)

   * multiple return values - This is a purely idealistic feature request, but would lead to some really nice optimisations while retaining very tidy code if implemented properly. Other languages support this, it's a nice modern feature... why not have it in D?

 I can see the ABI rules for this getting really complicated, much like how parameter passing rules on x64 are insanely complex compared to x32.  But I agree that it would be a nice feature to have.

 How so? Parameters are passed in a sequence of regs of the appropriate type, to a point, at which stage they get put on the stack... is x64 somehow more complicated than that?
 Multiple return values would use the exact same regs in reverse. There should be no side effects, the calling function has already (or has the capability to) stored off any save regs in order to pass args in the first place.

No, that's exactly how x64 works.  But compared to x32, where everything is simply pushed onto the stack... That's all I was saying.

 Fair call, but I don't have time to get involved at that level right now, not by a long shot.
 I'm just a potential customer trying to give it a fair go at this point... ;)

Then please be as specific as you can :-)  Don might have some experience in this area, but I suspect Walter does not.
Jan 05 2012
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 5 January 2012 23:50, Iain Buclaw <ibuclaw ubuntu.com> wrote:

   * inline asm supporting pseudo regs - As an extension of using the
vector
 hardware, I'll need inline asm, and inline assembly without pseudo regs
is
 pretty useless... it's mandatory that the compiler shedule the register
 allocation otherwise inline asm will most likely be an un-optimisation.
If D
 had pseudo regs in its inline assembler, it would make it REALLY
attractive
 for embedded systems programmers.
     In lieu of that, I need to use opcode intrinsics instead, which I
 believe GDC exposes, but again, I'm done with C and versioning
(#ifdef-ing)
 every compiler I intend to use. Why not standardise these things? At
least
 put the intrinsics in the standard lib...
This is only possible using GDC extended asm - which is really GCC asm but encapsulated in {} instead of ();
... shit. I fear this is a VERY serious problem that needs discussion and resolution. Now we have 2 competing standards of asm syntax in D... We're exactly in the same place as VisualC and GCC now. Epic fail.
Jan 05 2012
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 5 January 2012 23:53, Sean Kelly <sean invisibleduck.org> wrote:

 I recall Walter rejecting the idea because compilers shouldn't optimize
 across asm blocks.  This should probably be revisited at some point.
It's proven then, inline asm blocks break the optimiser... they are officially useless. This is why all C coders use intrinsics these days, and D should too.
 How so? Parameters are passed in a sequence of regs of the appropriate type, to a point, at which stage they get put on the stack... is x64 somehow more complicated than that?
 Multiple return values would use the exact same regs in reverse. There should be no side effects, the calling function has already (or has the capability to) stored off any save regs in order to pass args in the first place.

 No, that's exactly how x64 works.  But compared to x32, where everything is simply pushed onto the stack... That's all I was saying.
Ah yes, I forgot... x86 is such a shit architecture! ;) .. I rarely write x86 code, it's fairly pointless usually. Chips don't execute the opcodes you write anyway, they reinterpret and microcode them.. you can never know what's best, and it's different for every x86 processor/vendor :P Despite the assembly you read, x86 doesn't REALLY push all those args to the stack, they have much larger register banks internally, and use them... finally, x64 put an end to the nostalgic x86 nonsense :)
Jan 05 2012
prev sibling next sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 5 January 2012 22:01, Manu <turkeyman gmail.com> wrote:
 On 5 January 2012 23:50, Iain Buclaw <ibuclaw ubuntu.com> wrote:
   * inline asm supporting pseudo regs - As an extension of using the vector hardware, I'll need inline asm, and inline assembly without pseudo regs is pretty useless... it's mandatory that the compiler schedule the register allocation, otherwise inline asm will most likely be an un-optimisation. If D had pseudo regs in its inline assembler, it would make it REALLY attractive for embedded systems programmers.
     In lieu of that, I need to use opcode intrinsics instead, which I believe GDC exposes, but again, I'm done with C and versioning (#ifdef-ing) every compiler I intend to use. Why not standardise these things? At least put the intrinsics in the standard lib...
This is only possible using GDC extended asm - which is really GCC asm but encapsulated in {} instead of ();
... shit. I fear this is a VERY serious problem that needs discussion and resolutio=
n.
 Now we have 2 competing standards of asm syntax in D...
 We're exactly in the same place as VisualC and GCC now. Epic fail.
Why? The reasoning behind it is more so that you can write asm statements on all architectures, not just x86. And with GDC being a frontend of GCC, it seems a natural thing to support (this has actually been in GDC since 2004, so I'm not sure why you would throw your arms up about it now).

--
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
Jan 05 2012
prev sibling next sibling parent Artur Skawina <art.08.09 gmail.com> writes:
On 01/05/12 22:50, Iain Buclaw wrote:
 I don't think D enforces any sort of aliasing rules, but it would be
 nice to turn on strict aliasing though...
-fstrict-aliasing is already turned on by default in gdc... artur
Jan 05 2012
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 6 January 2012 00:10, Iain Buclaw <ibuclaw ubuntu.com> wrote:

 The reasoning behind is more so that you can write asm statements on
 all architectures, not just x86. And with GDC being a frontend of GCC,
 seems a natural thing to support (this has actually been in GDC since
 2004, so I'm not sure why you should through all arms up about it
 now).
When I was first reading about D, I read that the inline assembler syntax is built in and standardised in the language... and I gave a large sigh of relief. If that's not the case, and there are competing asm syntaxes in D, well... that sucks. Am I version-ing my asm blocks for DMD and GDC now, like I have to in C for VC and GCC? Surely D should settle on just one... If that happens to be the GCC syntax for compatibility, great...?
Jan 05 2012
prev sibling next sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 5 January 2012 22:16, Manu <turkeyman gmail.com> wrote:
 On 6 January 2012 00:10, Iain Buclaw <ibuclaw ubuntu.com> wrote:
 The reasoning behind is more so that you can write asm statements on
 all architectures, not just x86. And with GDC being a frontend of GCC,
 seems a natural thing to support (this has actually been in GDC since
 2004, so I'm not sure why you should through all arms up about it
 now).
When I was first reading about D I read that the inline assembler syntax is built in and standardised in the language... and I gave a large sigh of relief. If that's not the case, there are competing asm syntax in D, well... that sucks. Am I version-ing my asm blocks for DMD and GDC now like I have to in C for VC and GCC? Surely D should settle on just one... If that happens to be the GCC syntax for compatibility, great...?
For all its intentions, I think D-style syntax is great; however for GDC, all lines need to be translated into GCC-equivalent syntax when emitting the AST.

Example - what ARM assembly would look like in D:

asm
{
    cmp R0, R1;
    blt Lbmax;
    mov R2, R0;
    b Lrest;
Lbmax:
    mov R2, R1;
Lrest:
}

In order to compile *correctly*, we must be able to tell GCC what are outputs, what are inputs, what are labels, and what gets clobbered. The backend needs to know this to ensure the syntax is correct, and so it doesn't try to do anything odd that may invalidate what you are trying to do. ie:

Output operands: the compiler can check that outputs are lvalues.
Clobbers: tell the backend that a register is not free to use as a place to store temporary values.
Labels: tell the backend that this asm block of code could jump to a given location, meaning it should be protected from the usual dead code elimination passes.

For this to work requires the frontend to be *aware* of what assembly language it is compiling, and to be able to parse it and understand it correctly. Which would not be the most pleasant of things to implement, given the number of supported architectures. Converting x86 Intel syntax assembly to x86 GCC syntax assembly is enough for me. :)

--
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
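For comparison, a sketch of how that same block might look in the GCC-style extended form being discussed -- this assumes GDC accepts GCC's constraint syntax inside a {} asm block, and the constraint strings and local labels below are illustrative rather than tested:

int bmax(int a, int b)
{
    // %0 is the output operand, %1/%2 the inputs, "cc" marks the condition
    // flags as clobbered, and 1:/2: are GNU-as local labels
    int r = void;
    asm {
        "cmp %1, %2\n\tblt 1f\n\tmov %0, %1\n\tb 2f\n1:\n\tmov %0, %2\n2:"
        : "=r" (r)
        : "r" (a), "r" (b)
        : "cc";
    }
    return r;
}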
Jan 05 2012
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Jan 5, 2012, at 2:08 PM, Manu wrote:

 On 5 January 2012 23:53, Sean Kelly <sean invisibleduck.org> wrote:
 I recall Walter rejecting the idea because compilers shouldn't optimize across asm blocks. This should probably be revisited at some point.

 It's proven then, inline asm blocks break the optimiser... they are officially useless. This is why all C coders use intrinsics these days, and D should too.

For the record, some compilers do optimize across asm blocks. It's simply DMD/DMC that doesn't. Though the lack of "volatile" makes doing this unsafe in D as a general rule.
Jan 05 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 2:57 PM, Sean Kelly wrote:
 For the record, some compilers do optimize across asm blocks.  It's simply
 DMD/DMC that doesn't.  Though the lack of "volatile" makes doing this unsafe
 in D as a general rule.
dmd does keep track of register usage within asm blocks.
Jan 05 2012
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Jan 5, 2012, at 3:56 PM, Walter Bright wrote:

 On 1/5/2012 2:57 PM, Sean Kelly wrote:
 For the record, some compilers do optimize across asm blocks.  It's simply DMD/DMC that doesn't.  Though the lack of "volatile" makes doing this unsafe in D as a general rule.

 dmd does keep track of register usage within asm blocks.

Oh right, I guess it would have to, since variables can be used by name within asm blocks. I guess it just doesn't do code movement across asm blocks then?
Jan 05 2012
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 4:03 PM, Sean Kelly wrote:
 Oh right, I guess it would have to, since variables can be used by name
 within asm blocks.  I guess it just doesn't do code movement across asm
 blocks then?
Right. More generally, it does not do data flow analysis within an asm block, treating it as a black box that could do anything.
Jan 07 2012
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 8:49 AM, Manu wrote:
 I also wonder if the D language provides some opportunities for optimisation
 that aren't expressible in other languages,
There are some. One that is currently being exploited by the optimizer and back end is the existence of pure functions.
 and therefore may not already have
 an expression in the GCC back end... so I can imagine some of future
 optimisations frequently discussed in this forum won't just magically appear
 with GCC/LLVM maturity. I can't imagine Iain and co extending the GCC back end
 to support some obscure D optimisations happening any time soon.
Right.
 At some level I'd like to think people would accept the word of a seasoned game
 engine dev who's genuinely interested in adopting the language for that sort of
 work, but I completely understand those who are skeptical. ;)
I'm interested in hearing more. (The virtual thing can't change.)
Jan 05 2012
parent reply Manu <turkeyman gmail.com> writes:
On 5 January 2012 23:30, Walter Bright <newshound2 digitalmars.com> wrote:

 On 1/5/2012 8:49 AM, Manu wrote:

 I also wonder if the D language provides some opportunities for
 optimisation
 that aren't expressible in other languages,
There are some. One that is currently being exploited by the optimizer and back end is the existence of pure functions.
Does GDC currently support these same optimisations, or is this a DMD special power?
Jan 05 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 1:43 PM, Manu wrote:
 On 5 January 2012 23:30, Walter Bright <newshound2 digitalmars.com
 <mailto:newshound2 digitalmars.com>> wrote:

     On 1/5/2012 8:49 AM, Manu wrote:

         I also wonder if the D language provides some opportunities for
optimisation
         that aren't expressible in other languages,


     There are some. One that is currently being exploited by the optimizer and
     back end is the existence of pure functions.


 Does GDC currently support these same optimisations, or is this a DMD special
power?
I don't know what GDC does.
Jan 05 2012
parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 6 January 2012 00:03, Walter Bright <newshound2 digitalmars.com> wrote:
 On 1/5/2012 1:43 PM, Manu wrote:
 On 5 January 2012 23:30, Walter Bright <newshound2 digitalmars.com

 <mailto:newshound2 digitalmars.com>> wrote:

    On 1/5/2012 8:49 AM, Manu wrote:

        I also wonder if the D language provides some opportunities for optimisation
        that aren't expressible in other languages,

    There are some. One that is currently being exploited by the optimizer and
    back end is the existence of pure functions.

 Does GDC currently support these same optimisations, or is this a DMD
 special power?

 I don't know what GDC does.

GDC ties D optimisations to function attributes of GCC, so you'll have to do some reading up on the meaning. :) There are three levels of purity; these are matched to the const, pure and novops attributes.

--
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
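Roughly, and using ordinary D declarations to stand in for that mapping (illustrative code only), the first two levels look like:

// strongly pure: no mutable indirections among the parameters, so calls with
// the same arguments can be folded or reused freely
pure nothrow int square(int x)
{
    return x * x;
}

// weakly pure: callable from pure code, but it may write through its ref
// parameter, so calls cannot be cached or reordered as aggressively
pure nothrow int fill(ref int[4] a, int value)
{
    foreach (ref e; a) e = value;
    return value;
}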
Jan 05 2012
prev sibling next sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 5 January 2012 16:49, Manu <turkeyman gmail.com> wrote:
 D is not a compiler, it is a language. Furthermore it is not true that
 DMD's backend is rubbish and there are already more backends than just the
 DMC backend.

 Sorry, I was generalising a little in that claim. And where I say 'rubbish', I was drawing comparison to the maturity of C compilers for x86, which STILL have trouble and make lots of mistakes despite centuries of man hours of work.
 DMD has inferior code gen (there was a post last night comparing some disassemblies of trivial programs with GDC) and will probably never rival GCC; that's fine, it's a reference, I get that.
 But to say that using GDC will magically fix code gen is also false. I'm not familiar with the GCC code, so I may be wrong, but my understanding is that there is frontend work, and frontend-GCC glue work, that will allow back end optimisation (which GCC can do quite well) to work properly. This is still a lot of work for a small OSS team.
 I also wonder if the D language provides some opportunities for optimisation that aren't expressible in other languages, and therefore may not already have an expression in the GCC back end... so I can imagine some of the future optimisations frequently discussed in this forum won't just magically appear with GCC/LLVM maturity. I can't imagine Iain and co extending the GCC back end to support some obscure D optimisations happening any time soon.

Actually, it's just me. ;)

So far I have come across no D optimisations that aren't supported in GCC. In fact, most of the time I find myself thinking of how I can use obscure GCC optimisation X to improve D.

One example is an interesting feature of Fortran, though written with C++ in mind. Seems like something that could be right up D's street.

http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=147822

--
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
Jan 05 2012
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
So regarding my assumptions about translating the D front end expressions
to GCC? Is that all simpler than I imagine?
Do you think GDC generates optimal code comparable to C code?

What about pure functions, can you make good on optimisations like caching
results of pure functions, moving them outside loops, etc?

On 6 January 2012 00:03, Iain Buclaw <ibuclaw ubuntu.com> wrote:

 On 5 January 2012 16:49, Manu <turkeyman gmail.com> wrote:
 D is not a compiler, it is a language. Furthermore it is not true that
 DMDs backend is rubbish and there are already more backends than just
the
 DMC backend.
Sorry, I was generalising a little general in that claim. And where I say 'rubbish', I was drawing comparison to the maturity of C compilers for
x86,
 which STILL have trouble and make lots of mistakes with centuries of man
 hours of work.
 DMD has inferior code gen (there was a post last night comparing some
 disassemblies of trivial programs with GDC), will probably never rival
GCC,
 that's fine, it's a reference, I get that.
 But to say that using GDC will magically fix code gen is also false. I'm
not
 familiar with the GCC code, so I may be wrong, but my understanding is
that
 there is frontend work, and frontend-GCC glue work that will allow for
back
 end optimisation (which GCC can do quite well) to work properly. This is
 still a lot of work for a small OSS team.
 I also wonder if the D language provides some opportunities for
optimisation
 that aren't expressible in other languages, and therefore may not already
 have an expression in the GCC back end... so I can imagine some of future
 optimisations frequently discussed in this forum won't just magically
appear
 with GCC/LLVM maturity. I can't imagine Iain and co extending the GCC
back
 end to support some obscure D optimisations happening any time soon.
Actually, it's just me. ;) So far I have come across no D optimisations that aren't supported in GCC. Infact, most of the time I find myself thinking of how I can use obscure GCC optimisation X to improve D. One example is an interesting feature of Fortran, though written with C++ in mind. Seems like something that could be right up D's street. http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=147822 -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Jan 05 2012
prev sibling next sibling parent reply Iain Buclaw <ibuclaw ubuntu.com> writes:
On 5 January 2012 22:11, Manu <turkeyman gmail.com> wrote:
 So regarding my assumptions about translating the D front end expressions to
 GCC? Is that all simpler than I imagine?
 Do you think GDC generates optimal code comparable to C code?

 What about pure functions, can you make good on optimisations like caching
 results of pure functions, moving them outside loops, etc?
I think you are confusing the pure with memoization. I could be wrong however... :) -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Jan 05 2012
next sibling parent reply Peter Alexander <peter.alexander.au gmail.com> writes:
On 5/01/12 10:17 PM, Iain Buclaw wrote:
 On 5 January 2012 22:11, Manu<turkeyman gmail.com>  wrote:
 So regarding my assumptions about translating the D front end expressions to
 GCC? Is that all simpler than I imagine?
 Do you think GDC generates optimal code comparable to C code?

 What about pure functions, can you make good on optimisations like caching
 results of pure functions, moving them outside loops, etc?
I think you are confusing the pure with memoization. I could be wrong however... :)
I think Manu is right:

void foo(int x)
{
    int[10] a;
    foreach (ref e; a)
        e = bar(x);
}

If bar is pure then you can safely transform this into:

void foo(int x)
{
    int[10] a;
    auto barx = bar(x);
    foreach (ref e; a)
        e = barx;
}
Jan 05 2012
parent bearophile <bearophileHUGS lycos.com> writes:
Peter Alexander:

 void foo(int x)
 {
      int[10] a;
      foreach (ref e; a)
          e = bar(x);
 }
 
 If bar is pure then you can safely transform this into:
 
 void foo(int x)
 {
      int[10] a;
      auto barx = bar(x);
      foreach (ref e; a)
          e = barx;
 }
If bar is pure but it throws exceptions, the two versions of the code behave differently, so it's a wrong optimization. You need bar to be pure nothrow. Moving pure nothrow functions out of loops is an easy optimization, and even simple D compilers are meant to do it. Aggressively optimizing D compilers are also free to memoize some results of pure (and probably nothrow too) functions.

-------

Regarding the discussion about virtual functions, show me a D compiler able to de-virtualize very well, as the Oracle JVM does :-) Some time ago I asked the LLVM devs to improve this situation for LDC; they have now fixed most of my bug reports, so I think they will eventually fix this too (maybe partially):
http://llvm.org/bugs/show_bug.cgi?id=3100

Bye,
bearophile
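To make that concrete, a minimal sketch of a declaration the hoist is actually valid for (bar's body here is just a stand-in):

// pure alone is not enough: a throwing bar() would behave differently if
// hoisted; pure nothrow lets the call be moved out of the loop safely
pure nothrow int bar(int x)
{
    return x * x + 1;
}

void foo(int x)
{
    int[10] a;
    foreach (ref e; a)
        e = bar(x);   // eligible to be computed once, outside the loop
}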
Jan 05 2012
prev sibling parent reply Peter Alexander <peter.alexander.au gmail.com> writes:
On 5/01/12 10:17 PM, Iain Buclaw wrote:
 On 5 January 2012 22:11, Manu<turkeyman gmail.com>  wrote:
 So regarding my assumptions about translating the D front end expressions to
 GCC? Is that all simpler than I imagine?
 Do you think GDC generates optimal code comparable to C code?

 What about pure functions, can you make good on optimisations like caching
 results of pure functions, moving them outside loops, etc?
I think you are confusing the pure with memoization. I could be wrong however... :)
I think Manu is right:

void foo(int x)
{
    int[10] a;
    foreach (ref e; a)
        e = bar(x);
}

If bar is pure then you can safely transform this into:

void foo(int x)
{
    int[10] a;
    auto barx = bar(x);
    foreach (ref e; a)
        e = barx;
}

If bar is not pure then this transformation would be unsafe.
Jan 05 2012
parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 5 January 2012 23:40, Peter Alexander <peter.alexander.au gmail.com> wrote:
 On 5/01/12 10:17 PM, Iain Buclaw wrote:
 On 5 January 2012 22:11, Manu<turkeyman gmail.com>  wrote:
 So regarding my assumptions about translating the D front end expressions to
 GCC? Is that all simpler than I imagine?
 Do you think GDC generates optimal code comparable to C code?

 What about pure functions, can you make good on optimisations like caching
 results of pure functions, moving them outside loops, etc?

 I think you are confusing the pure with memoization.  I could be wrong however... :)

 I think Manu is right:

 void foo(int x)
 {
     int[10] a;
     foreach (ref e; a)
         e = bar(x);
 }

 If bar is pure then you can safely transform this into:

 void foo(int x)
 {
     int[10] a;
     auto barx = bar(x);
     foreach (ref e; a)
         e = barx;
 }

 If bar is not pure then this transformation would be unsafe.

Yes, it will do something like that, though the loop will be unrolled - and given that gdc supports vectorisation, I think the above example will likely be vectorised too. So off the top of my head:

void foo(int x)
{
    int[10] a;
    auto barx = bar(x);
    vector(4) vect = { barx, barx, barx, barx };
    *[&a]    = vect;
    *[&a+16] = vect;
    a[8] = barx;
    a[9] = barx;
}

Regards

--
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
Jan 05 2012
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 6 January 2012 00:17, Iain Buclaw <ibuclaw ubuntu.com> wrote:

 On 5 January 2012 22:11, Manu <turkeyman gmail.com> wrote:
 So regarding my assumptions about translating the D front end
expressions to
 GCC? Is that all simpler than I imagine?
 Do you think GDC generates optimal code comparable to C code?

 What about pure functions, can you make good on optimisations like
caching
 results of pure functions, moving them outside loops, etc?
I think you are confusing the pure with memoization. I could be wrong however... :)
Umm, maybe... but I don't think so. And I don't think you just answered either of my questions ;)
Jan 05 2012
prev sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 5 January 2012 22:22, Manu <turkeyman gmail.com> wrote:
 On 6 January 2012 00:17, Iain Buclaw <ibuclaw ubuntu.com> wrote:
 On 5 January 2012 22:11, Manu <turkeyman gmail.com> wrote:
 So regarding my assumptions about translating the D front end
 expressions to
 GCC? Is that all simpler than I imagine?
 Do you think GDC generates optimal code comparable to C code?

 What about pure functions, can you make good on optimisations like
 caching
 results of pure functions, moving them outside loops, etc?
I think you are confusing the pure with memoization. I could be wrong however... :)
Umm, maybe... but I don't think so. And I don't think you just answered either of my questions ;)
What I meant was, the pure attribute will only allow the compiler to reduce the number of times the function is called in certain circumstances. However, it makes no guarantees that it will do so, only if it thinks it's appropriate.

--
Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
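(For what it's worth, when caching is actually wanted rather than merely permitted, it can be requested explicitly; a minimal sketch using Phobos' std.functional.memoize, with bar standing in for some costly pure function:)

import std.functional : memoize;

int bar(int x) pure { return x * x; }   // stand-in for an expensive pure function

alias cachedBar = memoize!bar;          // explicit, guaranteed caching of results

void foo(int x)
{
    int[10] a;
    auto barx = cachedBar(x);           // computed once here, reused below
    foreach (ref e; a)
        e = barx;
}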
Jan 05 2012
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 1:02 AM, Manu wrote:
 My argument is that even IF the compiler some day attempts to make vector
 optimisations to float[4] arrays, the raw hardware should be exposed first, and
 allow programmers to use it directly. This starts with a language defined
 (platform independant) v128 type.
Manu, I appreciate your expertise in this matter, which I lack. I think you've made a great case. Can you flesh this out with more specific suggestions on what language changes would work best?
Jan 05 2012
parent reply Manu <turkeyman gmail.com> writes:
On 5 January 2012 22:47, Walter Bright <newshound2 digitalmars.com> wrote:

 On 1/5/2012 1:02 AM, Manu wrote:

 My argument is that even IF the compiler some day attempts to make vector
 optimisations to float[4] arrays, the raw hardware should be exposed
 first, and
 allow programmers to use it directly. This starts with a language defined
 (platform independant) v128 type.
Manu, I appreciate your expertise in this matter, which I lack. I think you've made a great case. Can you flesh this out with more specific suggestions on what language changes would work best?
Love to. I'll give it some thorough thought. There's more details than I think most would expect...
Jan 05 2012
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 1:22 PM, Manu wrote:
 Love to. I'll give it some thorough thought. There's more details than I think
 most would expect...
I know the devil is in the details :-) Anyhow, please start a new topic for that one. This thread is getting too large.
Jan 05 2012
prev sibling parent reply a <a a.com> writes:
 A language defined 128bit SIMD type would be fine for basically all
 architectures. Even though they support different operations on these
 registers, the size and allocation patterns are always the same across all
 architectures; 128 bits, 16byte aligned, etc. This allows at minimum
 platform independent expression of structures containing simd data, and
 calling of functions passing these types as args.
You forgot about AVX. It uses 256 bit registers and is supported in new Intel and AMD processors.
Jan 05 2012
parent Manu <turkeyman gmail.com> writes:
On 5 January 2012 22:33, a <a a.com> wrote:

 A language defined 128bit SIMD type would be fine for basically all
 architectures. Even though they support different operations on these
 registers, the size and allocation patterns are always the same across
all
 architectures; 128 bits, 16byte aligned, etc. This allows at minimum
 platform independent expression of structures containing simd data, and
 calling of functions passing these types as args.
You forgot about AVX. It uses 256 bit registers and is supported in new Intel and AMD processors.
AVX is another type, a new 256bit type, as double is to float, and should also have a keyword ;)
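(For reference, fixed-size vector types along these lines are what D's core.simd ended up providing; a minimal sketch, assuming a target with SSE support - Particle and add are made-up names:)

import core.simd;

struct Particle
{
    float4 position;   // 128-bit, 16-byte-aligned vector field
    float4 velocity;
}

float4 add(float4 a, float4 b)
{
    return a + b;      // element-wise add, performed in a vector register
}

// core.simd also declares 256-bit types such as double4/float8 for AVX targets.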
Jan 05 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/4/12 3:39 AM, Manu wrote:
   * __forceinline ... I wasn't aware this didn't exist... and yes,
 despite all this discussion, I still depend on this all the time. People
 are talking about implementing forceinline by immitating macros using
 mixins.... crazy? Here's a solid reason I avoid mixins or procedurally
 generated code (and the preprocessor in C for that matter, in favour of
 __forceinline): YOU CAN DEBUG IT. In an inline function, the code exists
 in the source file, just like any other function, you can STEP THE
 DEBUGGER through it, and inspect the values easily. This is an
 underrated requirement. I would waste hours on many days if I couldn't
 do this. I would only ever use string mixins for the most obscure uses,
 preferring inline functions for the sake of debugging 99% of the time.
Hmmm, I see that the other way around. D CTFE-generated macros are much easier to debug because you can actually print the code before mixing it in. If it looks like valid code... great. I think the deal with inline functions is significantly more complex. Inlining is the first step in a long pipeline of optimizations that often make the code virtually unrecognizable and impossible to map back to source in a way that's remotely understandable. Andrei
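(A minimal sketch of that workflow; genCode is a made-up CTFE function:)

string genCode(string name)
{
    return "int " ~ name ~ "() { return 42; }";
}

enum code = genCode("answer");
pragma(msg, code);   // dump the generated code at compile time and eyeball it
mixin(code);         // ...then mix it in

static assert(answer() == 42);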
Jan 04 2012
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Wednesday, 4 January 2012 at 14:28:07 UTC, Andrei Alexandrescu 
wrote:
 Hmmm, I see that the other way around. D CTFE-generated macros 
 are much easier to debug because you can actually print the 
 code before mixing it in. If it looks like valid code... great.
Paging Don Clugston: would it be feasable to have the compiler remember the source position of every single char/string literal or compile-time-evaluated string expression? I'm thinking that if the compiler tracks the source of every string/char literal in the source code, all across to any manipulations, debugging CTFE-generated code would be a lot easier - the compiler would emit error messages pointing inside string literals, and debuggers could step inside code in string literals. (The one thing this doesn't allow is allowing debuggers to step through a DSL with no D code in it.) The naive implementation would store the position of every character, which would blow up the memory usage by about 13 times or so on 32-bit? (For every character, add a struct with 3 fields - char* filename; int line, column). A rope-like structure could cut down on that but possibly drastically complicating the implementation.
Jan 04 2012
next sibling parent Rainer Schuetze <r.sagitario gmx.de> writes:
On 04.01.2012 15:33, Vladimir Panteleev wrote:
 On Wednesday, 4 January 2012 at 14:28:07 UTC, Andrei Alexandrescu wrote:
 Hmmm, I see that the other way around. D CTFE-generated macros are
 much easier to debug because you can actually print the code before
 mixing it in. If it looks like valid code... great.
Paging Don Clugston: would it be feasable to have the compiler remember the source position of every single char/string literal or compile-time-evaluated string expression? I'm thinking that if the compiler tracks the source of every string/char literal in the source code, all across to any manipulations, debugging CTFE-generated code would be a lot easier - the compiler would emit error messages pointing inside string literals, and debuggers could step inside code in string literals. (The one thing this doesn't allow is allowing debuggers to step through a DSL with no D code in it.) The naive implementation would store the position of every character, which would blow up the memory usage by about 13 times or so on 32-bit? (For every character, add a struct with 3 fields - char* filename; int line, column). A rope-like structure could cut down on that but possibly drastically complicating the implementation.
A simpler way to debug CTFE generated code is to dump it to another file and redirect the debugger to this file. Here is a patch that does just that, but it is probably not up to date anymore: http://d.puremagic.com/issues/show_bug.cgi?id=5051#c4
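(Short of patching the compiler, a similar effect can be had by hand: materialise the generated code as a real source file once, then compile and debug that file normally. A rough sketch, with made-up names:)

// gen.d - run once to write the generated code out as an ordinary module
import std.file : write;

string genCode()
{
    return "int answer() { return 42; }\n";
}

void main()
{
    write("generated.d", genCode());   // then import/compile generated.d as usual
}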
Jan 04 2012
prev sibling next sibling parent "Martin Nowak" <dawg dawgfoto.de> writes:
On 04.01.2012 at 15:33, Vladimir Panteleev <vladimir thecybershadow.net> wrote:

 On Wednesday, 4 January 2012 at 14:28:07 UTC, Andrei Alexandrescu wrote:
 Hmmm, I see that the other way around. D CTFE-generated macros are much  
 easier to debug because you can actually print the code before mixing  
 it in. If it looks like valid code... great.
Paging Don Clugston: would it be feasable to have the compiler remember the source position of every single char/string literal or compile-time-evaluated string expression? I'm thinking that if the compiler tracks the source of every string/char literal in the source code, all across to any manipulations, debugging CTFE-generated code would be a lot easier - the compiler would emit error messages pointing inside string literals, and debuggers could step inside code in string literals. (The one thing this doesn't allow is allowing debuggers to step through a DSL with no D code in it.) The naive implementation would store the position of every character, which would blow up the memory usage by about 13 times or so on 32-bit? (For every character, add a struct with 3 fields - char* filename; int line, column). A rope-like structure could cut down on that but possibly drastically complicating the implementation.
Last time I generated a big and complex mixin, I let the compiler output mixins to separate files. This gives you nicer debugging and readable error lines. https://github.com/D-Programming-Language/dmd/pull/426
Jan 09 2012
prev sibling parent Manu <turkeyman gmail.com> writes:
On 9 January 2012 12:01, Martin Nowak <dawg dawgfoto.de> wrote:

 On 04.01.2012 at 15:33, Vladimir Panteleev <vladimir thecybershadow.net> wrote:

  On Wednesday, 4 January 2012 at 14:28:07 UTC, Andrei Alexandrescu wrote:
 Hmmm, I see that the other way around. D CTFE-generated macros are much
 easier to debug because you can actually print the code before mixing it
 in. If it looks like valid code... great.
Paging Don Clugston: would it be feasable to have the compiler remember the source position of every single char/string literal or compile-time-evaluated string expression? I'm thinking that if the compiler tracks the source of every string/char literal in the source code, all across to any manipulations, debugging CTFE-generated code would be a lot easier - the compiler would emit error messages pointing inside string literals, and debuggers could step inside code in string literals. (The one thing this doesn't allow is allowing debuggers to step through a DSL with no D code in it.) The naive implementation would store the position of every character, which would blow up the memory usage by about 13 times or so on 32-bit? (For every character, add a struct with 3 fields - char* filename; int line, column). A rope-like structure could cut down on that but possibly drastically complicating the implementation.
Last time I generated a big and complex mixin, I let the compiler output mixins to separate files. This gives you nicer debugging and readable error lines. https://github.com/D-Programming-Language/dmd/pull/426
Amazing idea, this should be standard! That totally changes my feelings towards mixins :)
Jan 09 2012
prev sibling parent reply Manu <turkeyman gmail.com> writes:
On 4 January 2012 16:28, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org>wrote:

 On 1/4/12 3:39 AM, Manu wrote:

  * __forceinline ... I wasn't aware this didn't exist... and yes,
 despite all this discussion, I still depend on this all the time. People
 are talking about implementing forceinline by immitating macros using
 mixins.... crazy? Here's a solid reason I avoid mixins or procedurally

 generated code (and the preprocessor in C for that matter, in favour of
 __forceinline): YOU CAN DEBUG IT. In an inline function, the code exists
 in the source file, just like any other function, you can STEP THE
 DEBUGGER through it, and inspect the values easily. This is an
 underrated requirement. I would waste hours on many days if I couldn't
 do this. I would only ever use string mixins for the most obscure uses,
 preferring inline functions for the sake of debugging 99% of the time.
Hmmm, I see that the other way around. D CTFE-generated macros are much easier to debug because you can actually print the code before mixing it in. If it looks like valid code... great. I think the deal with inline functions is significantly more complex. Inlining is the first step in a long pipeline of optimizations that often make the code virtually unrecognizable and impossible to map back to source in a way that's remotely understandable.
It's rare to step through optimised code. You tend to debug and step in debug/unoptimised builds, where inline functions are usually not even inlined, and code flow still looks natural, and easy to follow.. This saves lots of time. C/C++ macros present the same problem of not being able to step and inspect values. Most industry programmers I work with tend to avoid macros for this reason above all others.
Jan 04 2012
parent Jerry <jlquinn optonline.net> writes:
Manu <turkeyman gmail.com> writes:
 It's rare to step through optimised code. You tend to debug and step in debug/
 unoptimised builds, where inline functions are usually not even inlined, and
 code flow still looks natural, and easy to follow.. This saves lots of time.
 C/C++ macros present the same problem of not being able to step and inspect
 values. Most industry programmers I work with tend to avoid macros for this
 reason above all others.
I do it all the time. Normally I with optimized builds because our code takes a long time to run, so if I can find the problem without going to the debug build, I save a fair bit of time. I find that you get used to the weirdnesses that show up when stepping through optimized code. I'm probably able to find problems I'm looking for 60-70% of the time without resorting to using the debug build. Jerry
Jan 05 2012
prev sibling next sibling parent Artur Skawina <art.08.09 gmail.com> writes:
On 01/04/12 16:31, Artur Skawina wrote:
 Function attributes seem like they could be an easy, backward compatible addition:

   @attr(attributes...)

 then define some obvious generic attributes like "inline" (which is (always|force)_inline, as it's the only one that makes sense), "noinline", "hot", "cold" etc. This lets you write "@attr(inline) int f(i){}" etc, but doesn't help the vendor specific attr case at all, unfortunately. [2]
User defined attributes. "@attr regparm=whatever_the_compiler_uses;", conditioned on version(). Would let you do "@attr(regparm) int f(int i){}" in a portable way. Still not sure I like it, but could work. Would have to accept and expand to multiple attrs though... artur
Jan 04 2012
prev sibling next sibling parent reply Artur Skawina <art.08.09 gmail.com> writes:
On 01/04/12 10:39, Manu wrote:
 Walter made an argument "The same goes for all those language extensions you
mentioned. Those are not part of Standard C. They are vendor extensions. Does
that mean that C is not actually a systems language? No."
 This is absurd... are you saying that you expect Iain to add these things to
GDC to that people can use them, and then create incompatible D code with the
'standard' compiler?
Some of these things are *already* in GDC... Probably not documented and tested enough [1], but they are there. So you /can/ have function declarations such as:

   pragma(GNU_attribute, always_inline, flatten, hot)
   int fxx(int i) { ... }

Now, this wouldn't be that bad, if we had a preprocessor or some kind of macro facility. But as it is, writing portable code is too expensive. (I may need to add a cpp stage for D because of issues like this, haven't decided yet...)
 Why would you intentionally fragment the compiler support of language features
rather than just making trivial (but important) features that people do use
part of the language?
There are more issues, that *will* be fixed in time, once (maybe even "if") D matures. A wiki page etc listing all the needed changes ("D next generation") would definitely be helpful. Not only to record what needs fixing, but also what to avoid. Could reduce the inevitable balkanization significantly.

Function attributes seem like they could be an easy, backward compatible addition:

   @attr(attributes...)

then define some obvious generic attributes like "inline" (which is (always|force)_inline, as it's the only one that makes sense), "noinline", "hot", "cold" etc. This lets you write "@attr(inline) int f(i){}" etc, but doesn't help the vendor specific attr case at all, unfortunately. [2]

artur

[1] Like the "target" "tune" (sub)attribute, which is not per function, but global (ie behaves as a C #pragma). That might be a gcc bug though. Also, using gcc asm in a function makes the compiler generate worse code (triggered by the /first/ asm use, next ones within a function are free).

[2] The problem is what do you do when you have a lot of functions/methods that need to be inlined/flattened, have a different calling convention or otherwise need to be specially marked _and_ it needs to be done differently for different compilers?...
Jan 04 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-01-04 16:31, Artur Skawina wrote:
 On 01/04/12 10:39, Manu wrote:
 Walter made an argument "The same goes for all those language extensions you
mentioned. Those are not part of Standard C. They are vendor extensions. Does
that mean that C is not actually a systems language? No."
 This is absurd... are you saying that you expect Iain to add these things to
GDC to that people can use them, and then create incompatible D code with the
'standard' compiler?
Some of these things are *already* in GDC... Probably not documented and tested enough [1], but they are there. So you /can/ have function declarations such as: pragma(GNU_attribute, always_inline, flatten, hot) int fxx(int i) { ... }
If you want your code to be portable (between compilers) you would need to wrap that in a version statement. -- /Jacob Carlborg
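(For illustration, such wrapping might look like the sketch below; GDC defines version(GNU), and the function body here is just a placeholder:)

version (GNU)
{
    pragma(GNU_attribute, always_inline, flatten, hot)
    int fxx(int i) { return i * 2; }
}
else
{
    int fxx(int i) { return i * 2; }
}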
Jan 04 2012
parent reply Artur Skawina <art.08.09 gmail.com> writes:
On 01/05/12 08:19, Jacob Carlborg wrote:
 On 2012-01-04 16:31, Artur Skawina wrote:
 On 01/04/12 10:39, Manu wrote:
 Walter made an argument "The same goes for all those language extensions you
mentioned. Those are not part of Standard C. They are vendor extensions. Does
that mean that C is not actually a systems language? No."
 This is absurd... are you saying that you expect Iain to add these things to
GDC to that people can use them, and then create incompatible D code with the
'standard' compiler?
Some of these things are *already* in GDC... Probably not documented and tested enough [1], but they are there. So you /can/ have function declarations such as: pragma(GNU_attribute, always_inline, flatten, hot) int fxx(int i) { ... }
If you want your code to be portable (between compilers) you would need to wrap that in a version statement.
Exactly. Which isn't a problem if you have one or two such functions. But it becomes one when you have hundreds. And different compilers use different conventions, some do not support every feature and/or need specific tweaks. Copy-and-pasting multiline "declaration attribute blocks" for every function that needs them does not really scale well.

In C/C++ this is CPP territory where you solve it with a #define, and all of the magic is both hidden and easily accessible in one place. Adding support for another compiler requires only editing of that one header, not modifying practically the whole project. Let's not even think about compiler version specific tweaks (due to compiler bugs or features appearing in newer versions)...

D, being in its infancy, may have been able to ignore these issues so far (having only one D frontend helps too), but w/o a std, every vendor will have to invent a way to expose non-std features. For common things such as forcing functions to be inlined, keeping them out of line, marking them as hot/cold, putting them in specific text sections etc, relying on vendor extensions is not really necessary.

It's bad enough that every compiler will use a different incompatible runtime, in some cases calling conventions - and consequently different shared libraries; reducing source code portability (even if just by making things harder than they should be) will lead to more balkanization...

artur
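(The closest D analogue to the single #define header is probably to keep the per-compiler attribute string in one place and mix it in at each declaration; a rough sketch with made-up names, reusing the GNU_attribute pragma shown earlier:)

// one central place holds the per-compiler magic
version (GNU)
    enum hotInline = "pragma(GNU_attribute, always_inline, hot)";
else
    enum hotInline = "";    // other compilers: no special marking

// every marked function goes through the same mixin
mixin(hotInline ~ q{
    int fxx(int i) { return i * 2; }
});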
Jan 05 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-01-05 10:54, Artur Skawina wrote:
 On 01/05/12 08:19, Jacob Carlborg wrote:
 On 2012-01-04 16:31, Artur Skawina wrote:
 On 01/04/12 10:39, Manu wrote:
 Walter made an argument "The same goes for all those language extensions you
mentioned. Those are not part of Standard C. They are vendor extensions. Does
that mean that C is not actually a systems language? No."
 This is absurd... are you saying that you expect Iain to add these things to
GDC to that people can use them, and then create incompatible D code with the
'standard' compiler?
Some of these things are *already* in GDC... Probably not documented and tested enough [1], but they are there. So you /can/ have function declarations such as: pragma(GNU_attribute, always_inline, flatten, hot) int fxx(int i) { ... }
If you want your code to be portable (between compilers) you would need to wrap that in a version statement.
Exactly. Which isn't a problem if you have one or two such functions. But becomes one when you have hundreds. And different compilers use different conventions, some do not support every feature and/or need specific tweaks. Copy-and-pasting multiline "declaration attribute blocks" for every function that needs them does not really scale well. In C/C++ this is CPP territory where you solve it with a #define, and all of the magic is both hidden and easily accessible in one place. Adding support for another compiler requires only editing of that one header, not modifying practically the whole project. Let's not even think about compiler version specific tweaks (due to compiler bugs or features appearing in newer versions)... D, being in its infancy, may have been able to ignore these issues so far (having only one D frontend helps too), but w/o a std, every vendor will have to invent a way to expose non-std features. For common things such as forcing functions to be inlined, keeping them out of line, marking them as hot/cold, putting them in specific text sections etc relying on vendor extensions is not really necessary. It's bad enough that every compiler will use a different incompatible runtime, in some cases calling conventions - and consequently different shared libraries; reducing source code portability (even if just by making things harder than they should be) will lead to more balkanization... artur
The pragma is a standard way to expose non-standard features. You just need to wrap it in version statements because the compiler will otherwise complain about unrecognized pragmas. If that's a good thing or not, I don't know. -- /Jacob Carlborg
Jan 05 2012
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 5 January 2012 at 13:40:20 UTC, Jacob Carlborg wrote:
 The pragma is a standard way to expose non-standard features. 
 You just need to wrap it in version statements because the 
 compiler will otherwise complain about unrecognized pragmas. If 
 that's a good thing or not, I don't know.
DMD has an -ignore switch:

  -ignore   ignore unsupported pragmas
Jan 05 2012
parent Jacob Carlborg <doob me.com> writes:
On 2012-01-05 14:59, Vladimir Panteleev wrote:
 On Thursday, 5 January 2012 at 13:40:20 UTC, Jacob Carlborg wrote:
 The pragma is a standard way to expose non-standard features. You just
 need to wrap it in version statements because the compiler will
 otherwise complain about unrecognized pragmas. If that's a good thing
 or not, I don't know.
DMD has an -ignore switch: -ignore ignore unsupported pragmas
I had no idea. -- /Jacob Carlborg
Jan 06 2012
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
Oh, and virtual-by-default... completely unacceptable for a systems
language. most functions are NOT virtual, and finding the false-virtuals
while optimising will be extremely tedious and time consuming. Worse, if
libraries contain false virtuals, there's good chance I may not be able to
use said library on certain architectures (PPC, ARM in particular).
Terrible decision... completely contrary to modern hardware design and
trends. Why invent a 'new' language for 10 year old hardware?

On 4 January 2012 11:39, Manu <turkeyman gmail.com> wrote:

 This conversation has meandered into one very specific branch, but I just
 want to add my 2c to the OP.
 I agree, I want D to be a useful systems language too. These are my issues
 to that end:

  * __forceinline ... I wasn't aware this didn't exist... and yes, despite
 all this discussion, I still depend on this all the time. People are
 talking about implementing forceinline by immitating macros using mixins...
 crazy? Here's a solid reason I avoid mixins or procedurally generated code
 (and the preprocessor in C for that matter, in favour of __forceinline):
 YOU CAN DEBUG IT. In an inline function, the code exists in the source
 file, just like any other function, you can STEP THE DEBUGGER through it,
 and inspect the values easily. This is an underrated requirement. I would
 waste hours on many days if I couldn't do this. I would only ever use
 string mixins for the most obscure uses, preferring inline functions for
 the sake of debugging 99% of the time.

  * vector type ... D has exactly no way to tell the compiler to allocate
 128bit vector registers, load/store, and pass then to/from functions. That
 is MOST of the register memory on virtually every modern processor, and D
 can't address it... wtf?

  * inline assembler needs pseudo registers ... The inline assembler is
 pretty crap, immitating C which is out-dated. Registers in assembly code
 should barely ever be addressed directly, they should only be addressed by
 TYPE, allowing the compiler to allocate available registers (and/or manage
 storing the the stack where required) as with any other code. Inline
 assembly without pseudo-registers is almost always an un-optimisation, and
 this is also the reason why almost all C programmers use hardware opcode
 intrinsics instead of inline assembly. There is no way without using
 intrinsics in C to allow the compiler to perform optimal register
 allocation, and this is still true for D, and in my opinion, just plain
 broken.

  * __restrict ... I've said this before, but not being able to hint that
 the compiler ignore possible pointer aliasing is a big performance problem,
 especially when interacting with C libs.

  * multiple return values (in registers) ... (just because I opened a
 topic about it before) This saves memory accesses in common cases where i
 want to return (x, y), or (retVal, errorCode) for instance.

 Walter made an argument "The same goes for all those language extensions
 you mentioned. Those are not part of Standard C. They are vendor
 extensions. Does that mean that C is not actually a systems language? No."
 This is absurd... are you saying that you expect Iain to add these things
 to GDC to that people can use them, and then create incompatible D code
 with the 'standard' compiler?
 Why would you intentionally fragment the compiler support of language
 features rather than just making trivial (but important) features that
 people do use part of the language?

 This is a great example of why C is shit, and a good example of why I'm
 interested in D at all...

 On 29 December 2011 13:19, Vladimir Panteleev <vladimir thecybershadow.net
 wrote:
 On Thursday, 29 December 2011 at 09:16:23 UTC, Walter Bright wrote:

 Are you a ridiculous hacker? Inline x86 assembly that the compiler
 actually understands in 32 AND 64 bit code, hex string literals like x"DE
 ADB EEF" where spacing doesn't matter, the ability to set data alignment
 cross-platform with type.alignof = 16, load your shellcode verbatim into a
 string like so: auto str = import("shellcode.txt");
I would like to talk about this for a bit.

Personally, I think D's system programming abilities are only half-way there. Note that I am not talking about use cases in high-level application code, but rather low-level, widely-used framework code, where every bit of performance matters (for example: memory copy routines, string builders, garbage collectors).

In-line assembler as part of the language is certainly neat, and in fact coming from Delphi to C++ I was surprised to learn that C++ implementations adopted different syntax for asm blocks. However, compared to some C++ compilers, it has severe limitations and is D's only trick in this alley.

For one thing, there is no way to force the compiler to inline a function (like __forceinline / __attribute((always_inline)) ). This is fine for high-level code (where users are best left with PGO and "the compiler knows best"), but sucks if you need a guarantee that the function must be inlined. The guarantee isn't just about inlining heuristics, but also implementation capabilities. For example, some implementations might not be able to inline functions that use certain language features, and your code's performance could demand that such a short function must be inlined. One example of this is inlining functions containing asm blocks - IIRC DMD does not support this. The compiler should fail the build if it can't inline a function tagged with forceinline, instead of shrugging it off and failing silently, forcing users to check the disassembly every time.

You may have noticed that GCC has some ridiculously complicated assembler facilities. However, they also open the way to the possibilities of writing optimal code - for example, creating custom calling conventions, or inlining assembler functions without restricting the caller's register allocation with a predetermined calling convention. In contrast, DMD is very conservative when it comes to mixing D and assembler. One time I found that putting an asm block in a function turned what were single instructions into blocks of 6 instructions each.

D's lacking in this area makes it impossible to create language features that are on the level of D's compiler built-ins. For example, I have tested three memcpy implementations recently, but none of them could beat DMD's standard array slice copy (despite that in release mode it compiles to a simple memcpy call). Why? Because the overhead of using a custom memcpy routine negated its performance gains. This might have been alleviated with the presence of sane macros, but no such luck. String mixins are not the answer: trying to translate macro-heavy C code to D using string mixins is string escape hell, and we're back to the level of shell scripts.

We've discussed this topic on IRC recently. From what I understood, Andrei thinks improvements in this area are not "impactful" enough, which I find worrisome. Personally, I don't think D qualifies as a true "system programming language" in light of the above. It's more of a compiled language with pointers and assembler.

Before you disagree with any of the above, first (for starters) I'd like to invite you to translate Daniel Vik's C memcpy implementation to D: http://www.danielvik.com/2010/02/fast-memcpy-in-c.html. It doesn't even use inline assembler or compiler intrinsics.
Jan 04 2012
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/4/2012 10:53 AM, Manu wrote:
 Oh, and virtual-by-default... completely unacceptable for a systems language.
  most functions are NOT virtual, and finding the false-virtuals while
 optimising will be extremely tedious and time consuming.
The only reason to use classes in D is for polymorphic behavior - and that means virtual functions. Even so, a class member function will be called directly if it is private or marked as 'final'.

An easy way to find functions that are not overridden (what you called false virtuals) is to add:

    final:

at the top of your class definition. The compiler will give you errors for any functions that need to be virtual.

If you don't want polymorphic behavior, use structs instead. Struct member functions are never virtual.
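(A minimal sketch of how that plays out, with made-up names: the one genuinely polymorphic method is declared above the final: label, everything after it is non-virtual and called directly. Had final: been placed at the very top, the compiler would reject Derived's override of draw - exactly the signal that draw needs to stay virtual.)

class Base
{
    void draw() { }          // kept virtual, because Derived overrides it
final:                       // everything from here on is non-virtual
    void update() { }
    int width() { return 0; }
}

class Derived : Base
{
    override void draw() { }
}

void main()
{
    Base b = new Derived;
    b.draw();      // virtual dispatch: calls Derived.draw
    b.update();    // direct call: Base.update is final
}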
 Worse, if libraries contain false virtuals, there's good chance I may not be
 able to use said library on certain architectures (PPC, ARM in particular).
??
Jan 04 2012
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Walter:

 The only reason to use classes in D is for polymorphic behavior - and that
means
 virtual functions.
I don't agree, in some cases I use final class instances instead of heap-allocated structs even when I don't need polymorphic behaviour just to avoid pointer syntax (there is also a bit higher probability of destructors being called, compared to heap-allocated structs). In some cases I've used a final class just to be able to use a this() with no arguments :-) Bye, bearophile
Jan 04 2012
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/4/2012 3:21 PM, bearophile wrote:
 The only reason to use classes in D is for polymorphic behavior - and that
 means virtual functions.
I don't agree, in some cases I use final class instances instead of heap-allocated structs even when I don't need polymorphic behaviour just to avoid pointer syntax (there is also a bit higher probability of destructors being called, compared to heap-allocated structs). In some cases I've used a final class just to be able to use a this() with no arguments :-)
There's no reason to avoid pointer syntax, because D has:

1. ref types
2. automatic dereferencing of pointers
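(A tiny sketch with made-up names, showing the automatic dereferencing:)

struct S { int x; }

void main()
{
    S* p = new S;     // GC-allocated struct, used through a pointer
    p.x = 3;          // no explicit dereference needed; same syntax as a value
    (*p).x = 4;       // the explicit form also works, but is rarely necessary
}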
Jan 04 2012
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2012-01-05 00:21, bearophile wrote:
 Walter:

 The only reason to use classes in D is for polymorphic behavior - and that
means
 virtual functions.
I don't agree, in some cases I use final class instances instead of heap-allocated structs even when I don't need polymorphic behaviour just to avoid pointer syntax (there is also a bit higher probability of destructors being called, compared to heap-allocated structs). In some cases I've used a final class just to be able to use a this() with no arguments :-) Bye, bearophile
You can get that with a static opCall for structs too. -- /Jacob Carlborg
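(A minimal sketch of that idiom, with a made-up struct: since a struct cannot declare a no-argument this(), a static opCall supplies the S() call syntax instead.)

struct S
{
    int x;

    static S opCall()   // stands in for a no-argument constructor
    {
        S s;
        s.x = 42;
        return s;
    }
}

void main()
{
    auto s = S();       // invokes the static opCall
    assert(s.x == 42);
}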
Jan 04 2012
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
You just missed a big discussion on IRC about this, where I think I made
some fair points that people actually agreed with.

On 1/4/2012 10:53 AM, Manu wrote:
 Oh, and virtual-by-default... completely unacceptable for a systems
 language.
  most functions are NOT virtual, and finding the false-virtuals while
 optimising will be extremely tedious and time consuming.
The only reason to use classes in D is for polymorphic behavior - and that means virtual functions. Even so, a class member function will be called directly if it is private or marked as 'final'.
Is this true? Surely the REAL reason to use classes is to allocate using the GC? Aren't struct's allocated on the stack, and passed to functions by value? Do I need to start using the ref keyword to use GC allocated structs?
 An easy way to find functions that are not overridden (what you called
 false virtuals) is to add:

   final:

 at the top of your class definition. The compiler will give you errors for
 any functions that need to be virtual.

 If you don't want polymorphic behavior, use structs instead. Struct member
 functions are never virtual.
I have never written a class in any language where the ratio of virtual to non-virtual functions is more than 1:10 or so... requiring that one explicitly declared the vastly more common case seems crazy. The thing I'm most worried about is people forgetting to declare 'final:' on a class, or junior programmers who DON'T declare final, perhaps because they don't understand it, or perhaps because they have 1-2 true-virtuals, and the rest are just defined in the same place... This is DANGEROUS. The junior programmer problem is one that can NOT be overstated, and doesn't seem to have been considered in a few design choices. I'll bet MOST classes result in an abundance of false-virtuals, and this is extremely detrimental to performance on modern hardware (and getting worse, not better, as hardware progresses).
  Worse, if libraries contain false virtuals, there's good chance I may not
 be
 able to use said library on certain architectures (PPC, ARM in
 particular).
??
If a library makes liberal (and completely unnecessary) virtual calls to the point where it performs too poorly on some architecture; let's say ARM, or PPC (architectures that will suffer far more than x86 from virtual calls), I can no longer use this library in my project... What a stupid position to be in. The main strength of any language is its wealth of libraries available, and a bad language decision prohibiting use of libraries for absolutely no practical reason is just broken by my measure.
Jan 04 2012
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/05/2012 12:26 AM, Manu wrote:
 You just missed a big discussion on IRC about this, where I think I made
 some fair points that people actually agreed with.

     On 1/4/2012 10:53 AM, Manu wrote:

         Oh, and virtual-by-default... completely unacceptable for a
         systems language.
           most functions are NOT virtual, and finding the false-virtuals
         while
         optimising will be extremely tedious and time consuming.


     The only reason to use classes in D is for polymorphic behavior -
     and that means
     virtual functions. Even so, a class member function will be called
     directly if
     it is private or marked as 'final'.


 Is this true? Surely the REAL reason to use classes is to allocate using
 the GC?
You can allocate any type using the GC.
 Aren't struct's allocated on the stack, and passed to functions by
 value? Do I need to start using the ref keyword to use GC allocated structs?
No.
     An easy way to find functions that are not overridden (what you
     called false virtuals) is to add:

        final:

     at the top of your class definition. The compiler will give you
     errors for any functions that need to be virtual.

     If you don't want polymorphic behavior, use structs instead. Struct
     member
     functions are never virtual.


 I have never written a class in any language where the ratio of virtual
 to non-virtual functions is more than 1:10 or so... requiring that one
 explicitly declared the vastly more common case seems crazy.
Are you sure that is the case? In my code, most class member functions are true virtual.
Jan 04 2012
parent reply Manu <turkeyman gmail.com> writes:
On 5 January 2012 01:40, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 01/05/2012 12:26 AM, Manu wrote:

 You just missed a big discussion on IRC about this, where I think I made
 some fair points that people actually agreed with.

    On 1/4/2012 10:53 AM, Manu wrote:

        Oh, and virtual-by-default... completely unacceptable for a
        systems language.
          most functions are NOT virtual, and finding the false-virtuals
        while
        optimising will be extremely tedious and time consuming.


    The only reason to use classes in D is for polymorphic behavior -
    and that means
    virtual functions. Even so, a class member function will be called
    directly if
    it is private or marked as 'final'.


 Is this true? Surely the REAL reason to use classes is to allocate using
 the GC?
You can allocate any type using the GC. Aren't struct's allocated on the stack, and passed to functions by
 value? Do I need to start using the ref keyword to use GC allocated
 structs?
No. An easy way to find functions that are not overridden (what you
    called false virtuals) is to add:

       final:

    at the top of your class definition. The compiler will give you
    errors for any functions that need to be virtual.

    If you don't want polymorphic behavior, use structs instead. Struct
    member
    functions are never virtual.


 I have never written a class in any language where the ratio of virtual
 to non-virtual functions is more than 1:10 or so... requiring that one
 explicitly declared the vastly more common case seems crazy.
Are you sure that is the case? In my code, most class member functions are true virtual.
Here's one I'm working on right now (C++). Base class for a UI system, surely one of the most heavily polymorphic types of code one can imagine. Count the virtuals... http://pastebin.com/dLUVvFsL
Jan 04 2012
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 01/05/2012 12:54 AM, Manu wrote:
 On 5 January 2012 01:40, Timon Gehr <timon.gehr gmx.ch
 <mailto:timon.gehr gmx.ch>> wrote:

     On 01/05/2012 12:26 AM, Manu wrote:

         You just missed a big discussion on IRC about this, where I
         think I made
         some fair points that people actually agreed with.

             On 1/4/2012 10:53 AM, Manu wrote:

                 Oh, and virtual-by-default... completely unacceptable for a
                 systems language.
                   most functions are NOT virtual, and finding the
         false-virtuals
                 while
                 optimising will be extremely tedious and time consuming.


             The only reason to use classes in D is for polymorphic
         behavior -
             and that means
             virtual functions. Even so, a class member function will be
         called
             directly if
             it is private or marked as 'final'.


         Is this true? Surely the REAL reason to use classes is to
         allocate using
         the GC?


     You can allocate any type using the GC.

         Aren't struct's allocated on the stack, and passed to functions by
         value? Do I need to start using the ref keyword to use GC
         allocated structs?


     No.

             An easy way to find functions that are not overridden (what you
             called false virtuals) is to add:

                final:

             at the top of your class definition. The compiler will give you
             errors for any functions that need to be virtual.

             If you don't want polymorphic behavior, use structs instead.
         Struct
             member
             functions are never virtual.


         I have never written a class in any language where the ratio of
         virtual
         to non-virtual functions is more than 1:10 or so... requiring
         that one
         explicitly declared the vastly more common case seems crazy.


     Are you sure that is the case?
     In my code, most class member functions are true virtual.


 Here's one I'm working on right now (C++).
 Base class for a UI system, surely one of the most heavily polymorphic
 types of code one can imagine.
Apparently that is not true.
 Count the virtuals... http://pastebin.com/dLUVvFsL
9/~65 approx 1:6.
Jan 04 2012
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/4/2012 3:26 PM, Manu wrote:
 Is this true? Surely the REAL reason to use classes is to allocate using the
GC?
 Aren't struct's allocated on the stack, and passed to functions by value? Do I
 need to start using the ref keyword to use GC allocated structs?
struct S { ... }

S* s = new S();   // struct is allocated on the GC
 I have never written a class in any language where the ratio of virtual to
 non-virtual functions is more than 1:10 or so... requiring that one explicitly
 declared the vastly more common case seems crazy.
I found the opposite to be true when I use OOP in C++. Either scheme is valid, saying one is crazy or prohibitive way overstates the case. (I have had a lot of bad experiences in C++ with accidentally overriding a non-virtual function. It's perfectly valid in C++, but man does your code behave bizarrely when you do it.) In any sensible class design, you're going to have to decide which functions are overrideable and which are not. There's no way around it, and no magic default.
 The thing I'm most worried about is people forgetting to declare 'final:' on a
 class, or junior programmers who DON'T declare final, perhaps because they
don't
 understand it, or perhaps because they have 1-2 true-virtuals, and the rest are
 just defined in the same place... This is DANGEROUS.
It isn't dangerous, it is just less optimal. What is dangerous is (in C++) the ability to override a non-virtual function, and the use of non-virtual destructors. It's also true that D's design makes it possible for a compiler to make direct calls if it is doing whole-program analysis and determines that there are no overrides of it.
Jan 04 2012
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Walter:

 What is dangerous is (in C++) the 
 ability to override a non-virtual function, and the use of non-virtual
destructors.
There is something left that I'd like to see D care more about, method hiding:

class Foo {
    string name = "c1";
    static void foo() {}
}
class Bar : Foo {
    string name = "c2";
    static void foo() {} // silent method hiding
}
void main() {}

class Foo {
    string name = "c1";
    static void foo() {}
}
class Bar : Foo {
    string name = "c2";
    static new void foo() {} // method hiding is now visible
}
void main() {}
Jan 04 2012
parent reply "Jesse Phillips" <jessekphillips+D gmail.com> writes:
On Thursday, 5 January 2012 at 01:36:44 UTC, bearophile wrote:
 Walter:

 What is dangerous is (in C++) the ability to override a 
 non-virtual function, and the use of non-virtual destructors.
There is something left that I'd like to see D care more about, method hiding:

class Foo {
    string name = "c1";
    static void foo() {}
}
class Bar : Foo {
    string name = "c2";
    static void foo() {} // silent method hiding
}
void main() {}
Should we just disallow this? If the function wasn't static it would just override foo. Or is that changing once override is required?
Jan 04 2012
parent reply bearophile <bearophileHUGS lycos.com> writes:
Jesse Phillips:

 class Foo {
   string name = "c1";
   static void foo() {}
 }
 class Bar : Foo {
   string name = "c2";
   static void foo() {} // silent method hiding
 }
 void main() {}
Should we just disallow this?
Sometimes it's a useful idiom, and probably some D code in the wild is using it already, so I don't think we should disallow it. I was just asking to force it to be syntactically explicit, just like override will do in D2. It seems Delphi too does the same thing using a different keyword (this is not too surprising, the language designers are partially the same). So far I have seen no arguments against the requirement (initially just a warning if you compile with -w) to use a keyword such as "new" there, while I have
 If the function wasn't static it would just override foo.
 Or is that changing once override is required?
Override usage is going to be (hopefully soon) compulsory in D (currently you need -w to see an error). So that code without both static and override is going to be refused :-) Bye, bearophile
Jan 05 2012
parent reply "Jesse Phillips" <jessekphillips+D gmail.com> writes:
On Thursday, 5 January 2012 at 23:12:21 UTC, bearophile wrote:
 Override usage is going to be (hopefully soon) compulsory in D 
 (currently you need -w to see an error). So that code without 
 both static and override is going to be refused :-)

 Bye,
 bearophile
I guess the question I was getting at, currently there is no way with 'new.' Is that intended once 'override' is required? and if not why have 'new' usable for static methods?
Jan 05 2012
parent bearophile <bearophileHUGS lycos.com> writes:
Jesse Phillips:

 currently there is no way with 'new.' Is that intended once 'override' is
 required? and if not why have 'new' usable for static methods?
As far as I know, no other changes are planned in connection with the introduction of compulsory 'override'. For the other questions, Walter or Andrei can probably give you a much better answer than me. Bye, bearophile
Jan 05 2012
prev sibling parent reply Manu <turkeyman gmail.com> writes:
 The thing I'm most worried about is people forgetting to declare 'final:'
 on a
 class, or junior programmers who DON'T declare final, perhaps because
 they don't
 understand it, or perhaps because they have 1-2 true-virtuals, and the
 rest are
 just defined in the same place... This is DANGEROUS.
It isn't dangerous, it is just less optimal. What is dangerous is (in C++) the ability to override a non-virtual function, and the use of non-virtual destructors.
In 15 years I have never once overridden a non-virtual function, assuming it was virtual, and wondering why it didn't work... have you? I've never even heard a story of a colleague, or even on the net of that ever happening (yes, I'm sure if I google specifically for it, I could find it, but it's never appeared in an article or such)... but I can point you at almost daily examples of junior programmers making silly mistakes that go un-noticed by their seniors. Especially common are mistakes in declaration where declaration attributes don't change whether the program builds and works or not. It seems to me the decision is that of sacrificing a real and common problem case with frequent and tangible evidence, for the feeling that the language is defined to do the 'right' thing?
 It's also true that D's design makes it possible for a compiler to make
 direct calls if it is doing whole-program analysis and determines that
 there are no overrides of it.
This is only possible with whole program optimisation, and some very crafty code that may or may not ever be implemented, and certainly isn't dependable from compiler vendor 'x'.. There would simply be no problem in the first place if the default was declared the other way around, and the compiler would need none of that extra code, and there are no problems of compiler maturity. Surely this sort of consideration is even more important for an open source project with a relatively small team like D than it is even for C++?
Jan 05 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 1:13 AM, Manu wrote:
 In 15 years I have never once overridden a non-virtual function, assuming it
was
 virtual, and wondering why it didn't work... have you?
Yes, I have. With a complex inheritance hierarchy, I was doing the optimization thing and removing 'virtual' from members that didn't get overridden. Then, some long time later, I'd add an override and forget to put the 'virtual' back on the original. Very strange behavior of my program would result. Even worse, frankly I defy anyone to look at a complex C++ inheritance hierarchy and say with certainty that you've verified that there are no overrides of non-virtual functions in it.
 I've never even heard a story of a colleague, or even on the net of that ever
 happening (yes, I'm sure if I google specifically for it, I could find it, but
 it's never appeared in an article or such)... but I can point you at almost
 daily examples of junior programmers making silly mistakes that go un-noticed
by
 their seniors. Especially common are mistakes in declaration where declaration
 attributes don't change whether the program builds and works or not.
That is the case with overriding a non-virtual function - the compiler will compile it anyway, and most of the time it will work. That's what makes it so eeevil.
 It seems to me the decision is that of sacrificing a real and common problem
 case with frequent and tangible evidence, for the feeling that the language is
 defined to do the 'right' thing?
The right thing should be the default.
     It's also true that D's design makes it possible for a compiler to make
     direct calls if it is doing whole-program analysis and determines that
there
     are no overrides of it.


 This is only possible with whole program optimisation, and some very crafty
code
 that may or may not ever be implemented, and certainly isn't dependable from
 compiler vendor 'x'.. There would simply be no problem in the first place if
the
 default was declared the other way around, and the compiler would need none of
 that extra code, and there are no problems of compiler maturity.
 Surely this sort of consideration is even more important for an open source
 project with a relatively small team like D than it is even for C++?
I feel the correct decision was made. But regardless, there's no way to reverse that decision, as it will break most every D program in existence, and be a HUGE annoyance to everyone who has D code.
Jan 05 2012
next sibling parent reply Manu <turkeyman gmail.com> writes:
 That is the case with overriding a non-virtual function - the compiler
 will compile it anyway, and most of the time it will work. That's what
 makes it so eeevil.
I saw today, or last night, someone suggesting a keyword to make non-virtual override explicit, and error otherwise. Which actually sounded like a really good idea to me, and also addresses this problem. I think a combination of not-virtual-by-default, and an explicit non-virtual override keyword would cover your concern, and also minimise the use of virtual functions. Sounds perfect to me ;) Overriding a non-virtual is actually very rare, and probably often unintended... I really like the idea of a keyword to make this rare use explicit.
 It seems to me the decision is that of sacrificing a real and common
 problem
 case with frequent and tangible evidence, for the feeling that the
 language is
 defined to do the 'right' thing?
The right thing should be the default.
But I fundamentally disagree your choice is 'right'.. This is obviously subjective, so I don't think that's a fair assertion. The problem was obviously not completely defined, and not addressed entirely.. I think the proposal above sounds like a better solution all round, it addresses everyones concerns, and adds a nice little safety bonus for rare non-virtual overriding ;) But as I've previously said, I understand this can't change now, I've let it go :P
Jan 05 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 1:03 PM, Manu wrote:
     That is the case with overriding a non-virtual function - the compiler will
     compile it anyway, and most of the time it will work. That's what makes it
     so eeevil.


 I saw today, or last night, someone suggesting a keyword to make non-virtual
 override explicit, and error otherwise. Which actually sounded like a really
 good idea to me, and also addresses this problem.
That's correct, it does address it. But not for C++.
     The right thing should be the default.
 But I fundamentally disagree your choice is 'right'..
Sure.
 This is obviously subjective, so I don't think that's a fair assertion.
By 'right', I don't necessarily mean 'the most efficient'. I mean that the code should be correct. It's ok if extra work is involved in creating the most efficient version. For example:

    int a;

automatically initializes a to zero. This is correct. If you want it to remain uninitialized,

    int a = void;

which will be faster in the cases where the compiler cannot optimize away a redundant initialization of a. But, it is dangerous because the compiler cannot always prove that a is initialized before use, hence it is not the default.
 But as I've previously said, I understand this can't change now, I've let it
go :P
I understand, I'm just explaining my point of view, and you're just explaining yours.
Jan 05 2012
parent Manu <turkeyman gmail.com> writes:
    The right thing should be the default.
 But I fundamentally disagree your choice is 'right'..
Sure. This is obviously subjective, so I don't think that's a fair assertion.

 By 'right', I don't necessarily mean 'the most efficient'. I mean that the
 code should be correct. It's ok if extra work is involved in creating the
 most efficient version.
But this solution is equally correct, and doesn't make any sacrifice for the most efficient version:

 * methods are not virtual by default.
 * overriding any common method is an error (great, now I know if I've made any sort of mistake).
 * a method declared virtual may be overridden as expected, and virtual-ness is safely confirmed by the lack of a compile error.
 * to override a regular method (a rare thing to do, but still your primary safety concern), you use an explicit keyword to do it. Now it's absolutely intentional.

This provides all the same safety guarantees, ie, your 'right'-ness, and doesn't sacrifice: performance/false-virtual risk, 'final' keyword spam, risk of forgetfulness and the junior coder factor... surely this is MORE 'right', by any measure? :)
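(Purely for illustration, a hypothetical sketch of those rules; this placement of 'virtual' and the 'hides' keyword are invented for the proposal and are not current D syntax:)

class Base
{
    void update() { }          // non-virtual by default under the proposal
    virtual void draw() { }    // dynamic dispatch only where explicitly requested
}

class Derived : Base
{
    override void draw() { }   // fine: Base.draw is declared virtual
    // void update() { }       // would be a compile error: silently hides a non-virtual
    // hides void update() { } // hypothetical explicit keyword for the rare case
}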
Jan 05 2012
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Jan 5, 2012, at 1:03 PM, Manu wrote:

 That is the case with overriding a non-virtual function - the compiler
 will compile it anyway, and most of the time it will work. That's what
 makes it so eeevil.

 I saw today, or last night, someone suggesting a keyword to make
 non-virtual override explicit, and error otherwise. Which actually
 sounded like a really good idea to me, and also addresses this problem.

I think the override keyword fits here, though in reverse.
Jan 05 2012
parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 5 January 2012 at 21:05:07 UTC, Sean Kelly wrote:
 On Jan 5, 2012, at 1:03 PM, Manu wrote:

 That is the case with overriding a non-virtual function - the 
 compiler will compile it anyway, and most of the time it will 
 work. That's what makes it so eeevil.
 
 I saw today, or last night, someone suggesting a keyword to 
 make non-virtual override explicit, and error otherwise. Which 
 actually sounded like a really good idea to me, and also 
 addresses this problem.
I think the override keyword fits here, though in reverse.
Jan 05 2012
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Jan 4, 2012, at 3:26 PM, Manu wrote:

 If a library makes liberal (and completely unnecessary) virtual calls
 to the point where it performs too poorly on some architecture; let's say
 ARM, or PPC (architectures that will suffer far more than x86 from
 virtual calls), I can no longer use this library in my project... What a
 stupid position to be in. The main strength of any language is its
 wealth of libraries available, and a bad language decision prohibiting
 use of libraries for absolutely no practical reason is just broken by my
 measure.

If a library is written without consideration to what is virtual and
what is not, its performance will be the least of your problems. Either
way, this ship has long since sailed. The impact of reversing this
setting would be enormous.
Jan 04 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/4/2012 4:30 PM, Sean Kelly wrote:
 If a library is written without consideration to what is virtual and what is
 not, its performance will be the least of your problems.
I agree. Such is a massive failure in designing a polymorphic type, and the language can't help with that.
Jan 04 2012
parent reply Manu <turkeyman gmail.com> writes:
On 5 January 2012 03:06, Walter Bright <newshound2 digitalmars.com> wrote:

 On 1/4/2012 4:30 PM, Sean Kelly wrote:

 If a library is written without consideration to what is virtual and what is
 not, its performance will be the least of your problems.
I agree. Such is a massive failure in designing a polymorphic type, and the language can't help with that.
I don't follow.. how is someone failing (or forgetting) to type 'final' a "massive design failure"? It's not a design failure, it's not even 'wrong'... it's INEVITABLE. And the language CAN help with that, by making expensive operations require explicit declaration. At least make a compiler flag so I can disable virtual-by-default for my project...?
Jan 05 2012
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 1:16 AM, Manu wrote:
 On 5 January 2012 03:06, Walter Bright <newshound2 digitalmars.com
 <mailto:newshound2 digitalmars.com>> wrote:

     On 1/4/2012 4:30 PM, Sean Kelly wrote:

         If a library is written without consideration to what is virtual and what is
         not, its performance will be the least of your problems.


     I agree. Such is a massive failure in designing a polymorphic type, and the
     language can't help with that.


 I don't follow.. how is someone failing (or forgetting) to type 'final' a
 "massive design failure"? It's not a design failure, it's not even 'wrong'...
 it's INEVITABLE.
 And the language CAN help with that, by making expensive operations require
 explicit declaration.
In any class design, one must decide which functions are overrideable and which are not. The language cannot do it for you; certainly not by switching around the default behavior.
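For illustration, in today's D that decision is spelled out per method with
'final'; a minimal sketch (the class is made up):

    class Widget
    {
        // Intended as a customization point, so it is left virtual (the default).
        void draw() { }

        // Not intended to be overridden: sealed with 'final', which also lets
        // the compiler call it directly (and potentially inline it).
        final int id() { return _id; }

        private int _id;
    }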
 At least make a compiler flag so I can disable virtual-by-default for my
 project...?

I'm afraid that such a switch would have disastrous results, because it
fundamentally alters the meaning of existing code.
Jan 05 2012
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/04/2012 07:53 PM, Manu wrote:
 Oh, and virtual-by-default... completely unacceptable for a systems
 language. most functions are NOT virtual, and finding the false-virtuals
 while optimising will be extremely tedious and time consuming. Worse, if
 libraries contain false virtuals, there's good chance I may not be able
 to use said library on certain architectures (PPC, ARM in particular).
 Terrible decision... completely contrary to modern hardware design and
 trends. Why invent a 'new' language for 10 year old hardware?
If you don't need virtual functions don't use classes.
Jan 04 2012
parent Manu <turkeyman gmail.com> writes:
On 5 January 2012 01:17, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 01/04/2012 07:53 PM, Manu wrote:

 Oh, and virtual-by-default... completely unacceptable for a systems
 language. most functions are NOT virtual, and finding the false-virtuals
 while optimising will be extremely tedious and time consuming. Worse, if
 libraries contain false virtuals, there's good chance I may not be able
 to use said library on certain architectures (PPC, ARM in particular).
 Terrible decision... completely contrary to modern hardware design and
 trends. Why invent a 'new' language for 10 year old hardware?
If you don't need virtual functions don't use classes.
Polymorphism isn't the only difference by a long shot. Allocation and referencing patterns are totally different. I don't feel this is a reasonable counter-argument.
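For reference, a minimal sketch of those differences (the types are made up): a
struct is a value type with no vtable, while a class is a reference type that is
normally allocated on the GC heap.

    struct PointS
    {
        float x, y;
        float lengthSq() { return x*x + y*y; }  // never virtual; structs have no vtable
    }

    class PointC
    {
        float x, y;
        float lengthSq() { return x*x + y*y; }  // virtual by default
    }

    void use()
    {
        PointS s;             // a value, typically on the stack; copied on assignment
        auto c = new PointC;  // a reference to a GC heap allocation
        s.lengthSq();
        c.lengthSq();
    }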
Jan 04 2012
prev sibling next sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 4 January 2012 09:39, Manu <turkeyman gmail.com> wrote:
 This conversation has meandered into one very specific branch, but I just
 want to add my 2c to the OP.
 I agree, I want D to be a useful systems language too. These are my issues
 to that end:

  * __forceinline ... I wasn't aware this didn't exist... and yes, despite
 all this discussion, I still depend on this all the time. People are talking
 about implementing forceinline by imitating macros using mixins... crazy?
 Here's a solid reason I avoid mixins or procedurally generated code (and the
 preprocessor in C for that matter, in favour of __forceinline): YOU CAN
 DEBUG IT. In an inline function, the code exists in the source file, just
 like any other function, you can STEP THE DEBUGGER through it, and inspect
 the values easily. This is an underrated requirement. I would waste hours on
 many days if I couldn't do this. I would only ever use string mixins for the
 most obscure uses, preferring inline functions for the sake of debugging 99%
 of the time.

  * vector type ... D has exactly no way to tell the compiler to allocate
 128bit vector registers, load/store them, and pass them to/from functions.
 That is MOST of the register memory on virtually every modern processor, and
 D can't address it... wtf?

  * inline assembler needs pseudo registers ... The inline assembler is
 pretty crap, imitating C which is out-dated. Registers in assembly code
 should barely ever be addressed directly, they should only be addressed by
 TYPE, allowing the compiler to allocate available registers (and/or manage
 storing to the stack where required) as with any other code. Inline
 assembly without pseudo-registers is almost always an un-optimisation, and
 this is also the reason why almost all C programmers use hardware opcode
 intrinsics instead of inline assembly. There is no way without using
 intrinsics in C to allow the compiler to perform optimal register
 allocation, and this is still true for D, and in my opinion, just plain
 broken.

  * __restrict ... I've said this before, but not being able to hint to the
 compiler that it may ignore possible pointer aliasing is a big performance
 problem, especially when interacting with C libs.

  * multiple return values (in registers) ... (just because I opened a topic
 about it before) This saves memory accesses in common cases where I want to
 return (x, y), or (retVal, errorCode) for instance.

 Walter made an argument: "The same goes for all those language extensions you
 mentioned. Those are not part of Standard C. They are vendor extensions.
 Does that mean that C is not actually a systems language? No."
 This is absurd... are you saying that you expect Iain to add these things to
 GDC so that people can use them, and then create incompatible D code with
 the 'standard' compiler?
 Why would you intentionally fragment the compiler support of language
 features rather than just making trivial (but important) features that
 people do use part of the language?

Code that gdc emits is incompatible with the standard D compiler, if
that's what you want to call it, and any vendor extensions won't
contribute to that being more of the case.

Regardless, there is little reason to want to use a forced inline with
gdc. Just like in C++ when you define all methods in the class
definition, gdc considers all methods as candidates for inlining.
Similarly, when -inline is passed, the same is also done for normal
functions that are considered inlinable by the frontend. These functions
marked as inline are treated in the same way by the backend as a function
declared 'inline' in C or C++.

--
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
Jan 04 2012
prev sibling next sibling parent Andrew Wiley <wiley.andrew.j gmail.com> writes:
On Wed, Jan 4, 2012 at 12:53 PM, Manu <turkeyman gmail.com> wrote:
 Oh, and virtual-by-default... completely unacceptable for a systems
 language. most functions are NOT virtual, and finding the false-virtuals
 while optimising will be extremely tedious and time consuming. Worse, if
 libraries contain false virtuals, there's good chance I may not be able to
 use said library on certain architectures (PPC, ARM in particular). Terrible
 decision... completely contrary to modern hardware design and trends. Why
 invent a 'new' language for 10 year old hardware?
The only benchmark of virtual functions on ARM that I can find is http://mikeash.com/pyblog/performance-comparisons-of-common-operations-iphone-edition.html , which found that the calls, when compared with other operations, performed similarly to x86. I'm not really sure what architecture-specific issues you're referring to here.
Jan 05 2012
prev sibling next sibling parent Artur Skawina <art.08.09 gmail.com> writes:
On 01/05/12 02:34, Iain Buclaw wrote:
 Code that gdc emits is incompatible with the standard D compiler, if
 that's what you want to call it, and any vendor extensions won't
 contribute to that being more of the case.
 
 Regardless, there is little reason to want to use a forced inline with
 gdc.  Just like in c++ when you define all methods in the class
 definition, gdc considers all methods as candidates for inlining.
 Similarly, when -inline is passed, the same is also done for normal
 functions that are considered inlinable by the frontend.  These
 functions marked as inline are treated in the same way as a function
 declared 'inline' in C or C++, and will be treated as such by the
  backend.

"C" inline is, for historical reasons, ill-defined; I think what people are
talking about in the context of D is the equivalent of gcc
attribute(always_inline). I.e. it's for the cases where not inlining is not an
option.

Having an explicit C-style "inline" hint is pointless - the compiler should be
able to guess this right most of the time. It's for the cases where the
programmer already knows the answer and is not willing to let the tool make a
mistake.

artur
Jan 05 2012
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
On 5 January 2012 03:34, Iain Buclaw <ibuclaw ubuntu.com> wrote:

 Regardless, there is little reason to want to use a forced inline with
 gdc.  Just like in c++ when you define all methods in the class
 definition, gdc considers all methods as candidates for inlining.
 Similarly, when -inline is passed, the same is also done for normal
 functions that are considered inlinable by the frontend.  These
 functions marked as inline are treated in the same way as a function
 declared 'inline' in C or C++, and will be treated as such by the
 backend.
How is this possible, when all functions are virtual, without whole program optimisation?
Jan 05 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 3:58 AM, Manu wrote:
 How is this possible, when all functions are virtual, without whole program
 optimisation?
Only public non-final class member functions are virtual. Furthermore, as in C++, the compiler can sometimes determine that a virtual function is being called directly, and will do so (or inline it).
Jan 05 2012
parent reply Manu <turkeyman gmail.com> writes:
On 5 January 2012 23:34, Walter Bright <newshound2 digitalmars.com> wrote:

 On 1/5/2012 3:58 AM, Manu wrote:

 How is this possible, when all functions are virtual, without whole program
 optimisation?
Only public non-final class member functions are virtual. Furthermore, as in C++, the compiler can sometimes determine that a virtual function is being called directly, and will do so (or inline it).
Can you define 'sometimes'? I have trouble believing that this will occur very
often; the stars aligning to that level of precision seems totally unreliable.
Consider the UI code snippet I posted earlier (I've lost the link): most
functions are public and not virtual (mostly accessors, or fairly simple
mutators), and they could only be identified as not-overridden with whole
program optimisation...

The fact you mention the potential for inlining actually heightens my criticism
with another detail I hadn't considered ;) ... Now all my trivial methods won't
only be virtual-called, they won't be inlined either!

I'm genuinely scared of people forgetting to type final (including myself)...
And it's hard as an external coder to go and clean up too. Adding final blocks
to someone else's existing code, you don't necessarily know what is truly
virtual or not... *mumble mumble*
Jan 05 2012
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 1:58 PM, Manu wrote:
 On 5 January 2012 23:34, Walter Bright <newshound2 digitalmars.com
 <mailto:newshound2 digitalmars.com>> wrote:

     On 1/5/2012 3:58 AM, Manu wrote:

          How is this possible, when all functions are virtual, without whole program
          optimisation?


     Only public non-final class member functions are virtual. Furthermore, as
in
     C++, the compiler can sometimes determine that a virtual function is being
     called directly, and will do so (or inline it).


 Can you define 'sometimes'?
In C++, it does it if it's a.foo() rather than pa->foo(). In D, it can be done
if flow analysis proves that the object foo() is being called on really is an
a, and not something derived from a. Also, if you qualify the member call with
the class name, it gets called directly, as in a.C.foo().
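A minimal D sketch of those cases (the classes are made up; the qualified call
is the form described in the previous paragraph):

    class Base
    {
        void foo() { }        // virtual by default
        final void bar() { }  // final: always called directly
    }

    class Child : Base
    {
        override void foo() { }
    }

    void test()
    {
        auto c = new Child;
        c.foo();        // a virtual call in general; direct only where the
                        // compiler can prove the exact type at this call site
        c.bar();        // direct call: bar is final
        c.Base.foo();   // qualified with the class name, as in a.C.foo() above:
                        // calls Base.foo directly, bypassing virtual dispatch
    }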
Jan 05 2012
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
About your junior comment, are virtuals really the biggest thing you
should worry about? There are infinitely many things a newbie
programmer will screw up (think linear algorithms, excessive memory
allocation, hardcoded, non-modular and thread-unsafe code, etc). I
think virtual calls are likely to be just *one* of your problems, and
probably not the biggest one.
Jan 05 2012
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Btw, I think people are having a misconception on what it means that D
is a systems-programming language. It doesn't mean that D by default
generates the fastest code and trades safety for performance; it means
it *allows* you to write such code. But you need to be aware of what
you're coding.
Jan 05 2012
prev sibling parent reply Manu <turkeyman gmail.com> writes:
On 5 January 2012 15:44, Andrej Mitrovic <andrej.mitrovich gmail.com> wrote:

 About your junior comment, are virtuals really the biggest thing you
 should worry about?
Sure it's not the biggest thing, it's one of numerous things. You'll notice I
listed a whole bunch of things in my post, and this isn't my thread; these were
in addition to the OP's comments. I'm just trying to add some weight to the
OP's sentiments, in that I feel the same way in many areas after a few weeks of
experience with D and writing some programs, and considering it for use in
future projects.
 There are infinitely many things a newbie
 programmer will screw up (think linear algorithms, excessive memory
 allocation, hardcoded, non-modular and thread-unsafe code, etc). I
 think virtual calls are likely to be just *one* of your problems, and
 probably not the biggest one.
The point is that this is one thing that is completely silently hidden, and the language could fix this tremendously easily by nothing more than a trivial decision of what is default. I realise that's unlikely to happen, this decision is done now, but I think it's important to raise this sort of issue anyway, so that future decisions have more points in the balance. It would also be generally nice if these concerns were acknowledged rather than brushed off. I'm not making problems for the sake of conversation. These are real issues that I encounter in my daily work.
Jan 05 2012
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2012 6:06 AM, Manu wrote:
 It would also be generally nice if these concerns were acknowledged rather than
 brushed off. I'm not making problems for the sake of conversation. These are
 real issues that I encounter in my daily work.
I do appreciate your effort in making these issues known to us.
Jan 05 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, December 29, 2011 12:23:47 maarten van damme wrote:
 I think it would be an object oriented language, I'm a believer in the
 string theory :)
Well, if you want to discuss string theory...

http://xkcd.com/171/
http://xkcd.com/397/

:)

- Jonathan M Davis
Jan 02 2012
prev sibling parent "Mattbeui" <matheus_nab hotmail.com> writes:
On Thursday, 29 December 2011 at 09:16:23 UTC, Walter Bright 
wrote:
 http://pastebin.com/AtuzJqh0
I thought this topic was about a mix of Go (Google Language) and D.
Jan 05 2012