www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Opinions: The Best and Worst of D (for a lecture/talk I intend to

reply "Aerolite" <nope nope.com> writes:
Hey all,

I've not posted here in a while, but I've been keeping up to 
speed with D's progress over the last couple of years and remain 
consistently impressed with the language.

I'm part of a new computing society in the University of 
Newcastle, Australia, and am essentially known throughout our 
Computer Science department as 'the D guy'. At the insistence of 
my peers, I have decided to give an introductory lecture on the D 
Programming Language, in order to expose more students to the 
increasingly amazing aspects of D. I expect to cover the good, 
the bad, the awesome, and the ugly, in a 
complement-criticism-complement styled talk, and while I have my 
own opinions regarding each of these things, I'd like a broader 
view from the community regarding these aspects, so that I may 
provide as accurate and as useful information as possible.

So, if you would be so kind, give me a bullet list of the aspects 
of D you believe to be good, awesome, bad, and/or ugly. If you 
have the time, some code examples wouldn't go amiss either! Try 
not to go in-depth to weird edge cases - remain general, yet 
informative. E.g. I consider D's string mixins to be in the 
'awesome' category, but its reliance on the GC for large segments 
of the standard library to be in the 'ugly' category.

Thanks so much for your time!
Jul 07 2014
next sibling parent "H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:
On Mon, Jul 07, 2014 at 11:47:25PM +0000, Aerolite via Digitalmars-d-learn
wrote:
[...]
 So, if you would be so kind, give me a bullet list of the aspects of D
 you believe to be good, awesome, bad, and/or ugly. If you have the
 time, some code examples wouldn't go amiss either! Try not to go
 in-depth to weird edge cases - remain general, yet informative. E.g. I
 consider D's string mixins to be in the 'awesome' category, but its
 reliance on the GC for large segments of the standard library to be in
 the 'ugly' category.
[...] String mixins are a controversial feature. Many (including myself) think they are definitely awesome, but that power also comes with the price of being harder to maintain, and difficult to integrate with IDE features (though the latter doesn't matter to me 'cos I don't use IDEs). Because of that, some others think they are a misfeature, and have proposed to remove them. But I don't think it's going away anytime soon, since several key features depend on it. Perhaps I may bring up a representative use case: operator overloading. In C++, if you implement a custom number type, for example, you have to overload operator+, operator*, operator/, operator-. And then you realize you left out operator+=, operator*=, operator/=, ... and then you need operator<, operator>, operator<=, operator>=, ad nauseum. In D? // This covers +, -, *, /, +=, -=, *=, /=, etc.. auto opBinary(string op)(NumType n) { return NumType(mixin(this.val ~ op ~ n.val)); } // And this covers all the comparison operators: int opCmp(NumType n) { return ... /* implementation here */ } Without string mixins, you'd have to copy-n-paste a ton of boilerplate to get the necessary operator overloadings. // About the GC, my opinion is that anti-GC sentiment is largely unfounded. The GC greatly simplifies certain tasks (string manipulation, for one thing, and returning recursive data structures like trees). There *are* cases where the GC can cause trouble, like in game engines, esp. given that the GC implementation in D leaves a lot of room for improvement, but a lot of C/C++ programmers have knee-jerk reactions about GCs for no other reason than unfamiliarity breeding contempt (I speak for myself, since I come from a strong C/C++ background, and had the same reaction), than any truly objective measurements. I'd say a large percentage of applications don't need to manually micro-manage memory, and having a GC eliminates an entire class of bugs and greatly improves productivity. // But anyway, not to dwell on a contentious issue, let's move on to something oft overlooked (and sometimes even maligned): unittests. I could sing praises of D's built-in unittest blocks all day. When I was young and foolish, I prided myself on being the macho C/C++ programmer whose code will work the first time after I write it. Who needs tests? They are for the weak-minded, who cannot grasp their code in every last detail of their programs in their head with absolute clarity. Real Programmers can write code in their dreams, and it will work the first time they run it. Except that they don't. :P When I first started learning D, I tried my best to ignore unittests, telling myself that it's for weaker programmers, but they just kept sitting their, right in the language, and staring at me, all cute-like, until I was shamed into actually writing a few of them just to quiet that nagging voice within. And lo and behold, I started finding bugs in my "perfect" code -- corner cases I missed (even though I painstakingly took care of all the *other* corner cases), subtle typos that compiled but gave the wrong answers, blatant errors that I missed due to obsession over tiny details, etc.. To my chagrin, I discovered that I was *not* the macho perfect programmer that I imagined, and that these unittests for the weak were singlehandedly improving the quality of my code by leaps and bounds. Sometimes, the mere act of writing unittests would bring my attention to a corner case that I'd missed, and I'd fix the code, long before my past self would've noticed it (by finding it as a runtime bug much later). When I revised the code later, I suddenly could have peace of mind that if the unittests didn't break, then in all likelihood the new code is still correct (or mostly so). Regressions were instantly noticed, rather than weeks or months down the road when I suddenly needed a corner case that I knew was previously working but had since been broken by a later revision. Unittests in the language == total win. External unittest frameworks may be more powerful, more flexible, etc., but I'd never use them, 'cos they are too troublesome. I have to switch out of coding mode, open a new file, and possibly shift gears to a different language, write the test, then switch back, and later on it becomes too trouble to keep switching back and forth (and maintaining the unittests) so I'd just not bother running them, and when I do run them, I'd regard them with loathing that they failed Yet Again!, and so I'd hesitate adding any more tests. D's built-in unittests? Total win. You can call other functions, submodules, etc., from the unittest, which makes it much, much easier to setup scaffolding for testing. Trying to do that in a dedicated testing language is an exercise in pain tolerance. // And it's getting late, and this email is getting way too long, but I can't help bringing up metaprogramming. That's probably the one thing that singlehandedly converted me to D. If you got your impression of metaprogramming from that nasty rats'-nest known as C++ templates, then you have my deepest sympathies, but that is *not* representative of what metaprogramming can be. D's templates, which are perhaps better regarded as compile-time parameters, plus compile-time introspection and CTFE, launch D into new heights of expressiveness and compile-time power. Regretfully I need to go soon, but I would point out Dmitry Olshansky's std.regex.ctRegex as an example of metaprogramming that allows you to write regexes in comfortable, natural syntax (unlike C++'s monstrous *cough*Xpressive*ahem*), yet with full compile-time optimization that outperforms basically every other regex library out there. No small feat, one has to admit! // Alright. The warts. The bad. There are a few, unfortunately. - The current AA implementation leaves a lot to be desired... It has been improving, and hope is bright, but it's still rather messy, and interacts poorly with other built-in features, like const / immutable, dtors, disabled ctors, etc.. Use string keys, and you're mostly OK. But once const starts getting in there, or dtors, and things quickly get ugly, real fast. Strange corner cases, weird bugs, and other pathological implementation holes start showing up. - Dtors, in general, are prone to nastiness. (One could argue this is where having a GC stinks.) Basic usage, for the most part, is OK. Start mixing it with, say, closures, containers, disabled, etc., and you quickly run into a minefield of pitfalls that will, on the positive side, *quickly* educate you on the arcane deep aspects of D, but on the negative side, will also have you pulling out all your hair in short order. - Using array literals with enums (i.e., as a manifest constant) is a Very Nasty Very Bad thing to do. It will silently allocate a new array every single time you reference that enum, thereby destroying the performance benefits of a compiled language very quickly. Having this as default behaviour is a big design mistake IMO, and doesn't jive with the rest of the language where the default behaviour is the sane behaviour, and to get weird behaviour you have to explicitly ask for it. - safe still isn't quite usable yet. It's there, and many things support it, but there are still fundamental features (including library features) that aren't safe-compatible yet, which makes it unusable for me. Not to mention there are currently holes in the safe system. - Contracts (DbC) are a good idea, but the current implementation is disappointing. There are still some glaring holes in it that may or may not be easy to fix. As things stand, it still has a ways to go. - ... OK, there's a lot more stuff I want to write but it's late and I have to go now, so maybe I'll chime in again later. Maybe. T -- Food and laptops don't mix.
Jul 07 2014
prev sibling next sibling parent Philippe Sigaud via Digitalmars-d-learn writes:
On Tue, Jul 8, 2014 at 7:50 AM, H. S. Teoh via Digitalmars-d-learn
<digitalmars-d-learn puremagic.com> wrote <quite a wall of text>

Wow, what to add to that? Maybe you scared other from participating ;-)

* I'll second metaprogramming: the alliance between templates, CTFE
and mixins is really nice. It's *the* feature (or triplet of features)
I think of when Walter says that many D parts are kind of "what for?"
in isolation but really grow into something awesome by using one
another.

* I'd add static introspection to the mix: using static if,
__traits(...) and is(...), clunky as the syntax is (there, one 'ugly'
thing for you), is easy and very powerful: the idea to have code
selecting its flow or extracting information (is that a static array?
Does this aggregate have an '.empty' method?). This is the basis for
std.algorithm idiom of 'static interface' which allows us compile-time
duck typing, which I find very pleasing.

* unittests are nice of course, H. S. Teoh already said it
expressively enough :-)

* I'd add arrays and slice. They are wonderfully simple to use,
efficient, etc. I remember learning D by translating the Euler Project
problems from C++ to D and it was a joy.

Which brings me to another feature I like: ranges. The idea is nothing
new, but I find it quite well-realized in D, far easier than other
compiled languages alternatives and since they are pervasive in the
library, you can really plug components into one another nicely.

For example:
http://wiki.dlang.org/Component_programming_with_ranges#Case_Study:_Formatting_a_Calendar

* dub is good, and you can show code.dlang.org in your presentation.

* Bonus point: different compilers. I like that. I regularly use at
least two in parallel while developping (comparing results, speed,
etc). Kudos to all the guys involved!



As for the bad and ugly:

* it's frustrating not to have a big collection module in Phobos, but
then I didn't propose one, so I'll shut up.
* there are still some not so nice interaction between
const/shared/inout/ auto ref, though I rarely hit them these days
* The compiler still has some quirks: I find it strange you can put
unused qualifiers in many places.

But very honestly, it was already a joy to use a few years ago, and
it's becoming better and better.
Jul 08 2014
prev sibling next sibling parent "H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:
On Wed, Jul 09, 2014 at 07:51:24AM +0200, Philippe Sigaud via
Digitalmars-d-learn wrote:
 On Tue, Jul 8, 2014 at 7:50 AM, H. S. Teoh via Digitalmars-d-learn
 <digitalmars-d-learn puremagic.com> wrote <quite a wall of text>
 
 Wow, what to add to that? Maybe you scared other from participating
 ;-)
I hope not. :) [...]
 * I'd add static introspection to the mix: using static if,
 __traits(...) and is(...), clunky as the syntax is (there, one 'ugly'
 thing for you), is easy and very powerful:
[...] Oh yeah, I forgot about that one. The syntax of is-expressions is very counterintuitive (not to mention noisy), and has too many special-cased meanings that are completely non-obvious for the uninitiated, for example: // Assume T = some type is(T) // is T a valid type? is(T U) // is T a valid type? If so, alias it to U is(T : U) // is T implicitly convertible to U? is(T U : V) // is T implicitly convertible to V? If so, // alias it to U is(T U : V, W) // does T match the type pattern V, for some // template arguments W? If so, alias to U is(T == U) // is T the same type as U? // You thought the above is (somewhat) consistent? Well look at // this one: is(T U : __parameters) // is T the type of a function? If so, alias U // to the parameter tuple of its arguments. That last one is remarkably pathological: it breaks away from the general interpretation of the other cases, where T is matched against the right side of the expression; here, __parameters is a magic keyword that makes the whole thing mean something else completely. Not to mention, what is "returned" in U is something extremely strange; it looks like a "type tuple", but it's actually something more than that. Unlike usual "type tuples", in addition to encoding the list of types of the function's parameters, it also includes the parameter names and attributes... except that you can only get at the parameter names using __traits(name,...). But indexing it like a regular "type tuple" will reduce its elements into mere types, on which __traits(name,...) will fail; you need to take 1-element slices of it in order to preserve the additional information. This strange, inconsistent behaviour only barely begins to make sense once you understand how it's implemented in the compiler. It's the epitome of leaky abstraction. T -- Do not reason with the unreasonable; you lose by definition.
Jul 09 2014
prev sibling next sibling parent reply "Meta" <jared771 gmail.com> writes:
On Monday, 7 July 2014 at 23:47:26 UTC, Aerolite wrote:
 Hey all,

 I've not posted here in a while, but I've been keeping up to 
 speed with D's progress over the last couple of years and 
 remain consistently impressed with the language.

 I'm part of a new computing society in the University of 
 Newcastle, Australia, and am essentially known throughout our 
 Computer Science department as 'the D guy'. At the insistence 
 of my peers, I have decided to give an introductory lecture on 
 the D Programming Language, in order to expose more students to 
 the increasingly amazing aspects of D. I expect to cover the 
 good, the bad, the awesome, and the ugly, in a 
 complement-criticism-complement styled talk, and while I have 
 my own opinions regarding each of these things, I'd like a 
 broader view from the community regarding these aspects, so 
 that I may provide as accurate and as useful information as 
 possible.

 So, if you would be so kind, give me a bullet list of the 
 aspects of D you believe to be good, awesome, bad, and/or ugly. 
 If you have the time, some code examples wouldn't go amiss 
 either! Try not to go in-depth to weird edge cases - remain 
 general, yet informative. E.g. I consider D's string mixins to 
 be in the 'awesome' category, but its reliance on the GC for 
 large segments of the standard library to be in the 'ugly' 
 category.

 Thanks so much for your time!
opDispatch is a mostly untapped goldmine of potential. Just take a look at this thread, where an (almost, depends on the compiler) no-cost safe dereference wrapper was implemented using it: http://forum.dlang.org/post/mailman.2584.1403213951.2907.digitalmars-d puremagic.com opDisptach also allows for vector swizzling, which is really nice for any kind of vector work. One of the uglier things in D is also a long-standing problem with C and C++, in that comparison of signed and unsigned values is allowed.
Jul 09 2014
parent reply "Dominikus Dittes Scherkl" writes:
On Wednesday, 9 July 2014 at 14:51:41 UTC, Meta wrote:
 One of the uglier things in D is also a long-standing problem 
 with C and C++, in that comparison of signed and unsigned 
 values is allowed.
I would like that, if it would be implemented along this line: /// Returns -1 if a < b, 0 if they are equal or 1 if a > b. /// this will always yield a correct result, no matter which numeric types are compared. /// It uses one extra comparison operation if and only if /// one type is signed and the other unsigned but the signed value is >= 0 /// (that is what you need to pay for stupid choice of type). int opCmp(T, U)(const(T) a, const(U) b) primitive if(isNumeric!T && isNumeric!U) { static if(Unqual!T == Unqual!U) { // use the standard D implementation } else static if(isFloatingPoint!T || isFloatingPoint!U) { alias CommonType!(T, U) C; return opCmp!(cast(C)a, cast(C)b); } else static if(isSigned!T && isUnsigned!U) { alias CommonType!(Unsigned!T, U) C; return (a < 0) ? -1 : opCmp!(cast(C)a, cast(C)b); } else static if(isUnsigned!T && isSigned!U) { alias CommonType!(T, Unsigned!U) C; return (b < 0) ? 1 : opCmp!(cast(C)a, cast(C)b); } else // both signed or both unsigned { alias CommonType!(T, U) C; return opCmp!(cast(C)a, cast(C)b); } }
Jul 09 2014
next sibling parent "Dominikus Dittes Scherkl" writes:
Of course without the ! after opCmp in the several cases.
Jul 09 2014
prev sibling parent reply "H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:
On Wed, Jul 09, 2014 at 04:24:38PM +0000, Dominikus Dittes Scherkl via
Digitalmars-d-learn wrote:
 On Wednesday, 9 July 2014 at 14:51:41 UTC, Meta wrote:
One of the uglier things in D is also a long-standing problem with C
and C++, in that comparison of signed and unsigned values is allowed.
I would like that, if it would be implemented along this line: /// Returns -1 if a < b, 0 if they are equal or 1 if a > b. /// this will always yield a correct result, no matter which numeric types are compared. /// It uses one extra comparison operation if and only if /// one type is signed and the other unsigned but the signed value is >= 0 /// (that is what you need to pay for stupid choice of type).
[...] Yeah, I don't see what's the problem with comparing signed and unsigned values, as long as the result is as expected. Currently, however, this code asserts, which is wrong: uint x = uint.max; int y = -1; assert(x > y);
 {
    static if(Unqual!T == Unqual!U)
Nitpick: should be: static if(is(Unqual!T == Unqual!U)) [...]
    else static if(isSigned!T && isUnsigned!U)
    {
       alias CommonType!(Unsigned!T, U) C;
       return (a < 0) ? -1 : opCmp!(cast(C)a, cast(C)b);
    }
    else static if(isUnsigned!T && isSigned!U)
    {
       alias CommonType!(T, Unsigned!U) C;
       return (b < 0) ? 1 : opCmp!(cast(C)a, cast(C)b);
    }
[...] Hmm. I wonder if there's a more efficient way to do this. For the comparison s < u, where s is a signed value and u is an unsigned value, whenever s is negative, the return value of opCmp must be negative. Assuming 2's-complement representation of integers, this means we simply copy the MSB of s (i.e., the sign bit) to the result. So we can implement s < u as: enum signbitMask = 1u << (s.sizeof*8 - 1); // this is a compile-time constant return (s - u) | (s & signbitMask); // look ma, no branches! which would translate (roughly) to the assembly code: mov eax, [<address of s>] mov ebx, [<address of u>] mov ecx, eax ; save the value of s for signbit extraction sub eax, ebx ; s - u or eax, ecx ; (s - u) | (s & signbitMask) (ret ; this is deleted if opCmp is inlined) which avoids a branch hazard in the CPU pipeline. Similarly, for the comparison u < s, whenever s is negative, then opCmp must always be positive. So this means we copy over the *negation* of the sign bit of s to the result. So we get this for u < s: enum signbitMask = 1u << (s.sizeof*8 - 1); // as before return (u - s) & ~(s & signbitMask); // look ma, no branches! which translates roughly to: mov eax, [<address of u>] mov ebx, [<address of s>] sub eax, ebx ; u - s not ebx ; ~(s & signbitMask) and eax, ebx ; (u - s) & ~(s & signbitMask) (ret ; this is deleted if opCmp is inlined) Again, this avoid a branch hazard in the CPU pipeline. In both cases, the first 2 instructions are unnecessary if the values to be compared are already in CPU registers. The naïve implementation of opCmp is just a single sub instruction (this is why opCmp is defined the way it is, BTW), whereas the "smart" signed/unsigned comparison is 4 instructions long. The branched version would look something like this: mov eax, [<address of u>] mov ebx, [<address of s>] jge label1 ; first branch mov eax, $#FFFFFFFF jmp label2 ; 2nd branch label1: sub eax, ebx label2: (ret) The 2nd branch can be replaced with ret if opCmp is not inlined, but requiring a function call to compare integers seems excessive, so let's assume it's inlined, in which case the 2nd branch is necessary. So as you can see, the branched version is 5 instructions long, and always causes a CPU pipeline hazard. So I submit that the unbranched version is better. ;-) (So much for premature optimization... now lemme go and actually benchmark this stuff and see how well it actually performs in practice. Often, such kinds of hacks often perform more poorly than expected due to unforeseen complications with today's complex CPU's. So for all I know, I could've just been spouting nonsense above. :P) T -- Debian GNU/Linux: Cray on your desktop.
Jul 09 2014
next sibling parent reply "Dominikus Dittes Scherkl" writes:
On Wednesday, 9 July 2014 at 17:13:21 UTC, H. S. Teoh via 
Digitalmars-d-learn wrote:
 On Wed, Jul 09, 2014 at 04:24:38PM +0000, Dominikus Dittes 
 Scherkl via Digitalmars-d-learn wrote:
 /// Returns -1 if a < b, 0 if they are equal or 1 if a > b.
 /// this will always yield a correct result, no matter which 
 numeric types are compared.
 /// It uses one extra comparison operation if and only if
 /// one type is signed and the other unsigned but the signed 
 value is >= 0
 /// (that is what you need to pay for stupid choice of type).
[...] Yeah, I don't see what's the problem with comparing signed and unsigned values, as long as the result is as expected. Currently, however, this code asserts, which is wrong: uint x = uint.max; int y = -1; assert(x > y);
Yes, this is really bad. But last time I got the response that this is so to be compatible with C. That is what I really thought was the reason why D throw away balast from C, to fix bugs.
    static if(Unqual!T == Unqual!U)
Nitpick: should be: static if(is(Unqual!T == Unqual!U))
Of course.
 [...]
    else static if(isSigned!T && isUnsigned!U)
    {
       alias CommonType!(Unsigned!T, U) C;
       return (a < 0) ? -1 : opCmp!(cast(C)a, cast(C)b);
    }
    else static if(isUnsigned!T && isSigned!U)
    {
       alias CommonType!(T, Unsigned!U) C;
       return (b < 0) ? 1 : opCmp!(cast(C)a, cast(C)b);
    }
[...] Hmm. I wonder if there's a more efficient way to do this.
I'm sure. But I think it should be done at the compiler, not in a library.
 {...]
 opCmp is just a single sub instruction (this is why opCmp is 
 defined the
 way it is, BTW), whereas the "smart" signed/unsigned comparison 
 is 4
 instructions long.
 [...]
 you can see, the branched version is 5 instructions long, and 
 always
 causes a CPU pipeline hazard.

 So I submit that the unbranched version is better. ;-)
I don't think so, because the branch will only be taken if the signed type is >= 0 (in fact unsigned). So if the signed/unsigned comparison is by accident, you pay the extra runtime. But if it is intentional the signed value is likely to be negative, so you get a correct result with no extra cost. Even better for constants, where the compiler can not only evaluate expressions like (uint.max > -1) correct, but it should optimize them completely away!
 (So much for premature optimization... now lemme go and actually
 benchmark this stuff and see how well it actually performs in 
 practice.
Yes, we should do this.
 Often, such kinds of hacks often perform more poorly than 
 expected due
 to unforeseen complications with today's complex CPU's. So for 
 all I
 know, I could've just been spouting nonsense above. :P)
I don't see such a compiler change as a hack. It is a strong improvement IMHO.
Jul 09 2014
next sibling parent "H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:
On Wed, Jul 09, 2014 at 05:43:15PM +0000, Dominikus Dittes Scherkl via
Digitalmars-d-learn wrote:
 On Wednesday, 9 July 2014 at 17:13:21 UTC, H. S. Teoh via
 Digitalmars-d-learn wrote:
[..]
Yeah, I don't see what's the problem with comparing signed and unsigned
values, as long as the result is as expected. Currently, however, this
code asserts, which is wrong:

	uint x = uint.max;
	int y = -1;
	assert(x > y);
Yes, this is really bad. But last time I got the response that this is so to be compatible with C. That is what I really thought was the reason why D throw away balast from C, to fix bugs.
I think the slogan was that if something in D looks like C, then it should either have C semantics or not compile. According to this logic, the only recourse here is to prohibit comparison of signed with unsigned, which I don't think is workable because there are many valid use cases for it (plus, it will break a ton of code and people will be very unhappy). I don't like the current behaviour, though. It just reeks of wrong in so many different ways. If you *really* want semantives like the above, you really should be casting y to unsigned so that it's clear what exactly you're trying to achieve. [...]
Hmm. I wonder if there's a more efficient way to do this.
I'm sure. But I think it should be done at the compiler, not in a library.
Obviously, yes. But I wasn't thinking about implementing opCmp in the library -- that would be strange since ints, of all things, need to have native compiler support. I was thinking more about how the compiler would implement "safe" signed/unsigned comparisons. [...]
So I submit that the unbranched version is better. ;-)
I don't think so, because the branch will only be taken if the signed type is >= 0 (in fact unsigned). So if the signed/unsigned comparison is by accident, you pay the extra runtime. But if it is intentional the signed value is likely to be negative, so you get a correct result with no extra cost.
Good point. Moreover, I have discovered multiple bugs in my proposed implementation; the correct implementation should be as follows: int compare(int x, uint y) { enum signbitMask = 1u << (int.sizeof*8 - 1); static assert(signbitMask == 0x80000000); // The (x|y) & signbitMask basically means that if either x is negative // or y > int.max, then the result will always be negative (sign bit // set). return (cast(uint)x - y) | ((x | y) & signbitMask); } unittest { // Basic cases assert(compare(5, 10u) < 0); assert(compare(5, 5u) == 0); assert(compare(10, 5u) > 0); // Large cases assert(compare(0, uint.max) < 0); assert(compare(50, uint.max) < 0); // Sign-dependent cases assert(compare(-1, 0u) < 0); assert(compare(-1, 10u) < 0); assert(compare(-1, uint.max) < 0); } int compare(uint x, int y) { enum signbitMask = 1u << (int.sizeof*8 - 1); static assert(signbitMask == 0x80000000); return ((x - y) & ~(x & signbitMask)) | ((cast(uint)y & signbitMask) >> 1); } unittest { // Basic cases assert(compare(0u, 10) < 0); assert(compare(10u, 10) == 0); assert(compare(10u, 5) > 0); // Large cases assert(compare(uint.max, 10) > 0); assert(compare(uint.max, -10) > 0); // Sign-dependent cases assert(compare(0u, -1) > 0); assert(compare(10u, -1) > 0); assert(compare(uint.max, -1) > 0); } Using gdc -O3, I managed to get a very good result for compare(int,uint), only 5 instructions long. However, for compare(uint,int), there is the annoying special case of compare(uint.max, -1), which can only be fixed by the hack ... | ((y & signbitMask) >> 1). Unfortunately, this makes it 11 instructions long, which is unacceptable. So it looks like a simple compare and branch would be far better in the compare(uint,int) case -- it's far more costly to avoid the branch than to live with it.
 Even better for constants, where the compiler can not only evaluate
 expressions like (uint.max > -1) correct, but it should optimize them
 completely away!
Actually, with gdc -O3, I found that the body of the above unittests got completely optimized away at compile-time, so that the unittest body is empty in the executable! So even with a library implementation the compiler was able to maximize performance. DMD left the assert calls in, but then it's not exactly known for generating optimal code anyway.
(So much for premature optimization... now lemme go and actually
benchmark this stuff and see how well it actually performs in
practice.
Yes, we should do this.
Often, such kinds of hacks often perform more poorly than expected
due to unforeseen complications with today's complex CPU's. So for
all I know, I could've just been spouting nonsense above. :P)
I don't see such a compiler change as a hack. It is a strong improvement IMHO.
I was talking about using | and & to get rid of the branch in signed/unsigned comparison. As it turns out, the compare(uint,int) case seems far more costly than a simple compare-and-branch as you had it at the beginning. So at least that part of what I wrote is probably nonsense. :P But I can't say for sure until I actually run some benchmarks on it. T -- "The number you have dialed is imaginary. Please rotate your phone 90 degrees and try again."
Jul 09 2014
prev sibling next sibling parent reply "H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:
On Wed, Jul 09, 2014 at 11:29:06AM -0700, H. S. Teoh via Digitalmars-d-learn
wrote:
 On Wed, Jul 09, 2014 at 05:43:15PM +0000, Dominikus Dittes Scherkl via
Digitalmars-d-learn wrote:
 On Wednesday, 9 July 2014 at 17:13:21 UTC, H. S. Teoh via
 Digitalmars-d-learn wrote:
[...]
Often, such kinds of hacks often perform more poorly than expected
due to unforeseen complications with today's complex CPU's. So for
all I know, I could've just been spouting nonsense above. :P)
I don't see such a compiler change as a hack. It is a strong improvement IMHO.
I was talking about using | and & to get rid of the branch in signed/unsigned comparison. As it turns out, the compare(uint,int) case seems far more costly than a simple compare-and-branch as you had it at the beginning. So at least that part of what I wrote is probably nonsense. :P But I can't say for sure until I actually run some benchmarks on it.
[...] Hmph. I'm having trouble coming up with a fair benchmark, because I realized that D doesn't actually have a way of expressing opCmp for unsigned int's in a minimal way! The problem is that the function needs to return int, but given two uints, their difference may be greater than int.max, so simply subtracting them will not work. So the best I can come up with is: int compare2(int x, uint y) { return (x < 0) ? -1 : (y > int.max) ? -1 : (x - y); } which requires 2 comparisons. Similarly, for the uint-int case: int compare2(uint x, int y) { return (y < 0) ? 1 : (x > int.max) ? 1 : (x - y); } If you have a better implementation in mind, I'm all ears. In any case, I went ahead and benchmarked the above two functions along with my branchless implementations, and here are the results: (with dmd -O -unittest:) non-branching compare(signed,unsigned): 5513 msecs branching compare(signed,unsigned): 5442 msecs non-branching compare(unsigned,signed): 5441 msecs branching compare(unsigned,signed): 5744 msecs Optimizer-thwarting value: 0 (with gdc -O3 -funittest:) non-branching compare(signed,unsigned): 516 msecs branching compare(signed,unsigned): 1209 msecs non-branching compare(unsigned,signed): 453 msecs branching compare(unsigned,signed): 756 msecs Optimizer-thwarting value: 0 (Ignore the last lines of each output; that's just a way to prevent gdc -O3 from being over-eager and optimizing out the entire test so that everything returns 0 msecs.) Interestingly, with dmd, the branching compare for the signed-unsigned case is faster than my non-branching one, but the order is reversed for the unsigned-signed case. They're pretty close, though, and on some runs the order of the latter case is reversed. With gdc, however, it seem the non-branching versions are clearly better, even in the unsigned-signed case, which I thought would be inferior. So clearly, these results are very optimizer-dependent. Keep in mind, though, that this may not necessarily reflect actual performance when the compiler generates the equivalent code for the built-in integer comparison operators, because in codegen the compiler can take advantage of the CPU's carry and overflow bits, and can elide the actual return values of opCmp. This may skew the results enough to reverse the order of some of these cases. Anyway, here's the code (for independent verification): int compare(int x, uint y) { enum signbitMask = 1u << (int.sizeof*8 - 1); static assert(signbitMask == 0x80000000); // The (x|y) & signbitMask basically means that if either x is negative // or y > int.max, then the result will always be negative (sign bit // set). return (cast(uint)x - y) | ((x | y) & signbitMask); } unittest { // Basic cases assert(compare(5, 10u) < 0); assert(compare(5, 5u) == 0); assert(compare(10, 5u) > 0); // Large cases assert(compare(0, uint.max) < 0); assert(compare(50, uint.max) < 0); // Sign-dependent cases assert(compare(-1, 0u) < 0); assert(compare(-1, 10u) < 0); assert(compare(-1, uint.max) < 0); } int compare2(int x, uint y) { return (x < 0) ? -1 : (y > int.max) ? -1 : x - y; } unittest { // Basic cases assert(compare2(5, 10u) < 0); assert(compare2(5, 5u) == 0); assert(compare2(10, 5u) > 0); // Large cases assert(compare2(0, uint.max) < 0); assert(compare2(50, uint.max) < 0); // Sign-dependent cases assert(compare2(-1, 0u) < 0); assert(compare2(-1, 10u) < 0); assert(compare2(-1, uint.max) < 0); } int compare(uint x, int y) { enum signbitMask = 1u << (int.sizeof*8 - 1); static assert(signbitMask == 0x80000000); return ((x - y) & ~(x & signbitMask)) | ((cast(uint)y & signbitMask) >> 1); } unittest { // Basic cases assert(compare(0u, 10) < 0); assert(compare(10u, 10) == 0); assert(compare(10u, 5) > 0); // Large cases assert(compare(uint.max, 10) > 0); assert(compare(uint.max, -10) > 0); // Sign-dependent cases assert(compare(0u, -1) > 0); assert(compare(10u, -1) > 0); assert(compare(uint.max, -1) > 0); } int compare2(uint x, int y) { return (y < 0) ? 1 : (x > int.max) ? 1 : x - y; } unittest { // Basic cases assert(compare2(0u, 10) < 0); assert(compare2(10u, 10) == 0); assert(compare2(10u, 5) > 0); // Large cases assert(compare2(uint.max, 10) > 0); assert(compare2(uint.max, -10) > 0); // Sign-dependent cases assert(compare2(0u, -1) > 0); assert(compare2(10u, -1) > 0); assert(compare2(uint.max, -1) > 0); } void main() { import std.datetime; import std.stdio : writefln; enum numRepeat = 5; int dummy; // deoptimizer auto nobranch_su = numRepeat.benchmark!(() { foreach (s; -10_000 .. 10_000) { foreach (uint u; 0 .. 20_000) dummy ^= compare(s, u); } }); writefln("non-branching compare(signed,unsigned): %d msecs", nobranch_su[0].msecs); auto branch_su = numRepeat.benchmark!(() { foreach (s; -10_000 .. 10_000) { foreach (uint u; 0 .. 20_000) dummy ^= compare2(s, u); } }); writefln("branching compare(signed,unsigned): %d msecs", branch_su[0].msecs); auto nobranch_us = numRepeat.benchmark!(() { foreach (s; -10_000 .. 10_000) { foreach (uint u; 0 .. 20_000) dummy ^= compare(u, s); } }); writefln("non-branching compare(unsigned,signed): %d msecs", nobranch_us[0].msecs); auto branch_us = numRepeat.benchmark!(() { foreach (s; -10_000 .. 10_000) { foreach (uint u; 0 .. 20_000) dummy ^= compare2(u, s); } }); writefln("branching compare(unsigned,signed): %d msecs", branch_us[0].msecs); writefln("Optimizer-thwarting value: %d", dummy); } T -- The best way to destroy a cause is to defend it poorly.
Jul 09 2014
parent reply "Dominikus Dittes Scherkl" writes:
On Wednesday, 9 July 2014 at 19:54:47 UTC, H. S. Teoh via 
Digitalmars-d-learn wrote:
 [...]
 The problem is that the function needs to return
 int, but given two uints, their difference may be
 greater than int.max, so simply subtracting them
 will not work. So the best I can come up with is:

 	int compare2(int x, uint y)
 	{
 		return (x < 0) ? -1 :
 			(y > int.max) ? -1 :
 			(x - y);
 	}

 which requires 2 comparisons.
Hmm. Diff works for compare(int,int). So how about this: int compare2(int x, uint y) { return (y > int.max) ? -1 : (x - cast(int)y); } int compare2(uint x, int y) { return (x > int.max) ? -1 : (cast(int)x - y); }
Jul 10 2014
parent reply "Dominikus Dittes Scherkl" writes:
Should of course be:

int compare2(uint x, int y)
{
    return (x > int.max) ? 1 : (cast(int)x - y);
}
Jul 10 2014
parent "Dominikus Dittes Scherkl" writes:
So at all the implementation will look something like this:

int opCmp(T, U)(const(T) a, const(U) b)  primitive 
if(isIntegral!T && isIntegral!U)
{
    alias CommonType!(Signed!T, Signed!U) C;

    static if(isSigned!T && isUnsigned!U)
    {
       return (b > cast(Unsigned!C)C.max) ? -1 : cast(C)a - 
cast(C)b;
    }
    else static if(isUnsigned!T && isSigned!U)
    {
       return (a > cast(Unsigned!C)C.max) ? 1 : cast(C)a - 
cast(C)b;
    }
    else // both signed or both unsigned
    {
       return cast(C)a - cast(C)b;
    }
}

And it will be just as fast as ever, except if you compare apples 
with peaches where it take a tick longer but give the correct 
result anyway
Jul 10 2014
prev sibling parent "H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:
On Wed, Jul 09, 2014 at 12:53:08PM -0700, H. S. Teoh via Digitalmars-d-learn
wrote:
[...]
 (with gdc -O3 -funittest:)
 
 	non-branching compare(signed,unsigned): 516 msecs
 	branching compare(signed,unsigned): 1209 msecs
 	non-branching compare(unsigned,signed): 453 msecs
 	branching compare(unsigned,signed): 756 msecs
 	Optimizer-thwarting value: 0
 
 (Ignore the last lines of each output; that's just a way to prevent gdc
 -O3 from being over-eager and optimizing out the entire test so that
 everything returns 0 msecs.)
[...] Argh. I just looked at the disassembly, and unfortunately, we have to discard the test results for gdc, because gdc -O3 has apparently turned on auto-*vectorising* optimizations, so the reason the non-branching implementation runs so fast, is because multiple calls are being run in parallel in the xmm* registers! While this is certainly an impressive feat for gdc's optimizer, it unfortunately also means the above benchmark doesn't reflect the actual performance of standalone int/uint comparisons. :-( T -- I see that you JS got Bach.
Jul 09 2014
prev sibling next sibling parent "anonymous" <anonymous example.com> writes:
On Wednesday, 9 July 2014 at 17:13:21 UTC, H. S. Teoh via
Digitalmars-d-learn wrote:
 For the comparison s < u, where s is a signed value and u is an 
 unsigned
 value, whenever s is negative, the return value of opCmp must be
 negative.  Assuming 2's-complement representation of integers, 
 this
 means we simply copy the MSB of s (i.e., the sign bit) to the 
 result. So
 we can implement s < u as:

 	enum signbitMask = 1u << (s.sizeof*8 - 1); // this is a 
 compile-time constant
 	return (s - u) | (s & signbitMask); // look ma, no branches!
This is a problem, isn't it: void main() { assert(cmp(0, uint.max) < 0); /* fails */ } int cmp(int s, uint u) { enum signbitMask = 1u << (s.sizeof*8 - 1); // this is a compile-time constant return (s - u) | (s & signbitMask); // look ma, no branches! }
Jul 09 2014
prev sibling parent "Dominikus Dittes Scherkl" writes:
On Wednesday, 9 July 2014 at 17:13:21 UTC, H. S. Teoh via 
Digitalmars-d-learn wrote:
 The branched version would look something like this:

 	mov	eax, [<address of u>]
 	mov	ebx, [<address of s>]

 	jge	label1		; first branch
 	mov	eax, $#FFFFFFFF
 	jmp	label2		; 2nd branch
 label1:
 	sub	eax, ebx
 label2:
 	(ret)
Why? I would say: mov eax, [<adress of s>] ; mov directly compares to zero jl lable ; less -> jump to return sub eax, [<adress of u>] neg eax ; because we subtracted in the wrong order lable: ret
Jul 09 2014
prev sibling parent "JR" <zorael gmail.com> writes:
On Monday, 7 July 2014 at 23:47:26 UTC, Aerolite wrote:
 So, if you would be so kind, give me a bullet list of the 
 aspects of D you believe to be good, awesome, bad, and/or ugly. 
 If you have the time, some code examples wouldn't go amiss 
 either! Try not to go in-depth to weird edge cases - remain 
 general, yet informative. E.g. I consider D's string mixins to 
 be in the 'awesome' category, but its reliance on the GC for 
 large segments of the standard library to be in the 'ugly' 
 category.
I'm a big fan of (templated) UFCS. Along with parameterless function calls it goes a long way toward improving readability with its pipe-like flow. And like with all templates, allowing for breaking out common functionality *without* resorting to inheritance. Templates overall are sexy. void main() { import std.stdio; import std.conv; import std.string; import std.range; import core.thread; // values known at compile-time --> CTFE kicks in~ static assert(12345.to!string == "12345"); immutable period = 1.seconds + 100.msecs + 100.usecs; writeln(period); Thread.sleep(period); immutable line = "fedcba" .retro .to!string .toUpper; // wish this worked: typeof(line).is!string; static assert(is(typeof(line) : string)); static assert(line == "ABCDEF"); }
Jul 10 2014