
digitalmars.D - C++Now! 2012 slides

reply "bearophile" <bearophileHUGS lycos.com> writes:
The slide packs of the conference C++Now! 2012 are available:
https://github.com/boostcon/cppnow_presentations_2012

Some of those slide packs seem too large for the GitHub 
blob-serving interface. To solve this problem download them all 
as a zip from here (a large amount of stuff):
https://github.com/boostcon/cppnow_presentations_2012/downloads

Among those many (sometimes large) slide packs there is some 
good stuff. I think some of that stuff is interesting for D 
programmers too. I have chosen four of them.

----------------------------

"50 Boost C++ Libraries in 180 minutes", by Boris Schaling:

Among the libraries shown, these are the 12 I'd like to see in Phobos:

- Boost.Pool Memory management with an allocator based on a 
singleton.
- Boost.Multiindex Create new containers which provide multiple 
interfaces to look up items.
- Boost.Bimap A ready-to-use container based on Boost.Multiindex 
with exactly two interfaces.
- Boost.CircularBuffer A fixed-size container which overwrites 
the oldest items if you keep inserting more (I'd like a dynamic 
version of this too).
- Boost.DynamicBitset Works exactly like std::bitset except that 
the size can be set (and modified) at run-time (in Phobos I'd 
like an in-place-allocated fixed-size one too).
- Boost.Flyweight Flyweight pattern: Sharing common data between 
objects to minimize memory usage.
- Boost.Accumulators Containers which calculate new results 
whenever a new value is pushed into them.
- Boost.Rational Use exact representations of rational numbers 
like 1/3 in C++.
- Boost.Graph A library to solve problems like finding the 
shortest route between two subway stations.
- Boost.Conversion Three cast operators for numbers and 
polymorphic types.
- Boost.MinMax Find the minimum and maximum of two or multiple 
values with one function call.
- Boost.Hash Classes and functions to return hash values and to 
build your own for user-defined types.

----------------------------

"Metaprogramming Applied to Numerical Problems - A Generic 
Implementation of Runge-Kutta Algorithms", by Mario Mulansky and 
Karsten Ahnert: this should not be hard to do in D, with less 
and simpler code.

----------------------------

"More Useful Computations in the Same Duration: Optimizing 
Embedded Hard Real-Time Code in C++", by Scott Schurr:

Page 46: #pragma no_alias. Are we going to need something like 
that (or "restrict") in D too?


Page 55, Managing Code Bloat:

Maybe D should offer a built-in solution for this common C++ 
coding pattern:

class TCB_impl {
     template<int chan> friend class TCB;
     inline void status(int chan) { ... }
     void source(int chan, int st) { ... }
};

template <int chan>
class TCB {
public:
     inline void status() { impl_.status(chan); }
     inline void source(int st) { impl_.source(chan, st); }
private:
     TCB_impl impl_;
};


Some time ago I suggested the attribute @templated(). If there 
are no template parameter names inside its (), then that 
function/member is not templated, so it's like the 
functions/members of TCB_impl:


class TCB(int chan) {
public:
     void status() { status_impl(chan); }
     @templated() void status_impl(int chan) { ... }

     void source(int st) { source_impl(chan, st); }
     @templated() void source_impl(int chan, int st) { ... }
}


But maybe something better is possible:

class TCB(int chan) {
     @templated() immutable int mchan;
public:
     this() { this.mchan = chan; }
     @templated() void status() { /*uses mchan*/ ... }
     @templated() void source(int st) { /*uses mchan*/ ... }
}

----------------------------

"Now What?" by Sean Parent (Adobe).

Slide 29 (page 18), "Desktop Compute Power", shows that nowadays 
for some tasks GPU and vectorized code are much faster than 
regular sequential C++ code.


Slides 36 and 38:
The typical object-oriented paradigm of using shared references 
to objects breaks down in a massively parallel environment:
* Sharing implies either single threading
* Or synchronization
To utilize the hardware we need to move towards functional, 
declarative, reactive, and value semantic programming. No raw loops.


Slide 40 (page 25) adds to slide 29: "Without addressing 
vectorization, GPGPU, and scalable parallelism, standard C++ is 
just a scripting system to get to the other 99% of the machine 
through other languages and libraries. Do we need such a complex 
scripting system?"

The same question is valid for D. It seems important.

Bye,
bearophile
Jun 07 2012
next sibling parent reply Artur Skawina <art.08.09 gmail.com> writes:
On 06/07/12 18:04, bearophile wrote:
 Page 46: #pragma no_alias Are we going to need something like that (or
"restrict") in D too?

"restrict" functionality - yes, but it's probably better done the other way around - restrict as default and "alias" when necessary. Reusing the keyword could work too, if template parameters is the only thing that can clash. artur
Jun 07 2012
parent Artur Skawina <art.08.09 gmail.com> writes:
On 06/07/12 20:19, Francois chabot wrote:
 On Thursday, 7 June 2012 at 17:17:07 UTC, Artur Skawina wrote:
 On 06/07/12 18:04, bearophile wrote:
 Page 46: #pragma no_alias Are we going to need something like that (or
"restrict") in D too?

"restrict" functionality - yes, but it's probably better done the other way around - restrict as default and "alias" when necessary. Reusing the keyword could work too, if template parameters is the only thing that can clash.

For this to work, the compiler would need to infer aliasing with 100% accuracy in order to emit errors. And if that were possible we would not need programmer hints at all. With restrict as a keyword, at least the programmer is aware of how "dangerous" that code is.

Yes, it can't (always) be statically checked. But the alternative is restrict proliferation; it applies not only to pointers, but also to slices. Hence defining the language in a way that reduces the need for extra annotations would be best. I'm not sure if that can be done safely for at least some subset of cases (function arguments, mostly); that's why I said "probably" above. But it is something that should be explored before blindly transplanting "restricted" to D. While static compile-time checks aren't always possible, the rest could in theory be handled by run-time checks in non-release mode. I'm not sure it would be enough.

A way to explicitly tag things as not aliasing anything else would also be handy; 'restricted' would work for that, maybe even 'uniq' could be overloaded to help. This is e.g. for pointers extracted from some data structures, which are known not to alias any others of the same kind present in the current scope.

artur
Jun 07 2012
prev sibling next sibling parent "Francois chabot" <Francois.chabot.dev gmail.com> writes:
On Thursday, 7 June 2012 at 17:17:07 UTC, Artur Skawina wrote:
 On 06/07/12 18:04, bearophile wrote:
 Page 46: #pragma no_alias Are we going to need something like 
 that (or "restrict") in D too?

"restrict" functionality - yes, but it's probably better done the other way around - restrict as default and "alias" when necessary. Reusing the keyword could work too, if template parameters is the only thing that can clash. artur

For this to work, the compiler would need to infer aliasing with 100% accuracy in order to emit errors. And if that were possible we would not need programmer hints at all. With restrict as a keyword, at least the programmer is aware of how "dangerous" that code is.
Jun 07 2012
prev sibling next sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Thursday, 7 June 2012 at 16:05:00 UTC, bearophile wrote:
 "Now What?" by Sean Parent (Adobe).

I very much liked that presentation. It's nice to see someone looking at C++ in the big picture. I also liked his comment on the "beauty" of std::pair:

"Complete std::pair 372 Lines"

D suffers from this too. Here is std.algorithm.min (ignoring the definitions of CommonType, mostNegative, and isIntegral).

template MinType(T...)
{
    static assert(T.length >= 2);
    static if (T.length == 2)
    {
        static if (!is(typeof(T[0].min)))
            alias CommonType!(T[0 .. 2]) MinType;
        else static if (mostNegative!(T[1]) < mostNegative!(T[0]))
            alias T[1] MinType;
        else static if (mostNegative!(T[1]) > mostNegative!(T[0]))
            alias T[0] MinType;
        else static if (T[1].max < T[0].max)
            alias T[1] MinType;
        else
            alias T[0] MinType;
    }
    else
    {
        alias MinType!(MinType!(T[0 .. 2]), T[2 .. $]) MinType;
    }
}

MinType!(T1, T2, T) min(T1, T2, T...)(T1 a, T2 b, T xs)
{
    static if (T.length == 0)
    {
        static if (isIntegral!(T1) && isIntegral!(T2)
                   && (mostNegative!(T1) < 0) != (mostNegative!(T2) < 0))
            static if (mostNegative!(T1) < 0)
                immutable chooseB = b < a && a > 0;
            else
                immutable chooseB = b < a || b < 0;
        else
            immutable chooseB = b < a;
        return cast(typeof(return)) (chooseB ? b : a);
    }
    else
    {
        return min(min(a, b), xs);
    }
}

I find this very ugly. To be honest, I would be much happier without all that mostNegative and common type stuff. If I want to get the min between a short and an int I'll just cast them appropriately. I'd much rather have a simpler standard library than a complicated one for the sake of a little convenience.

Don't get me started on std.algorithm.find...
 The same question is valid for D. It seems important.

It is. D addresses vectorization a little with its array ops (although ISPC (http://ispc.github.com/) destroys both D and C++ in this arena) and we're yet to see if D provides scalable parallelism.
Jun 07 2012
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 06/07/2012 08:34 PM, Peter Alexander wrote:
 I find this very ugly. To be honest, I would be much happier without all
 that mostNegative and common type stuff.
 If I want to get the min
 between a short and an int I'll just cast them appropriately.

The mostNegative and common type stuff is there to enable correct semantics for taking the minimum of signed and unsigned values. Minimum of short and int would work fine without that.
 I'd much rather have a simpler standard library than a complicated one for the
 sake of a little convenience.

'min' is not complicated.
Jun 07 2012
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 06/07/2012 10:53 PM, Peter Alexander wrote:
 On Thursday, 7 June 2012 at 20:04:56 UTC, Timon Gehr wrote:
 On 06/07/2012 08:34 PM, Peter Alexander wrote:
 I find this very ugly. To be honest, I would be much happier without all
 that mostNegative and common type stuff.
 If I want to get the min
 between a short and an int I'll just cast them appropriately.

The mostNegative and common type stuff is there to enable correct semantics for taking the minimum of signed and unsigned values. Minimum of short and int would work fine without that.

I know what it's there for, I don't think it is necessary. I don't think people should be taking the min of signed and unsigned values.

What is important is that they can. ;)
 I don't think it's worth cluttering the implementation for this minor
convenience.

There is not much there to clutter in this case. Therefore cluttering does not have a significant drawback. Is it just about aesthetics?
 I'd much rather have a simpler standard library than a complicated
 one for the
 sake of a little convenience.

'min' is not complicated.

It's more complicated than it could/should be:

T min(T)(T x, T y) { return x < y ? x : y; }

This implementation is unstable. If you fix that, then I'll agree that it is sufficient for most cases. There is not really a reason to restrict the generality of the implementation to this though. What we have is a strict superset.
 To be honest, I don't think the variadic argument versions are necessary
 either as it is just duplicating the functionality of reduce.

It is not really duplicating the functionality. Still, I haven't used these versions so far either.
Jun 07 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/7/12 5:19 PM, Peter Alexander wrote:
 On Thursday, 7 June 2012 at 21:22:57 UTC, Timon Gehr wrote:
 There is not much there to clutter in this case. Therefore cluttering
 does not have a significant drawback. Is it just about aesthetics?

Aesthetics is part of it, but it's more than that. It makes the code more difficult to read.

Great points, example could be a lot better. Maybe it's time you do get started on find(). Destroy or be destroyed. Andrei
Jun 07 2012
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/7/12 6:50 PM, Peter Alexander wrote:
 On Thursday, 7 June 2012 at 22:29:09 UTC, Andrei Alexandrescu wrote:
 Great points, example could be a lot better. Maybe it's time you do
 get started on find(). Destroy or be destroyed.

Ok... This overload:

R find(alias pred = "a == b", R, E)(R haystack, E needle)
if (isInputRange!R &&
    is(typeof(binaryFun!pred(haystack.front, needle)) : bool))
{
    for (; !haystack.empty; haystack.popFront())
    {
        if (binaryFun!pred(haystack.front, needle)) break;
    }
    return haystack;
}

Is fine. In fact it's practically perfect. It's the obvious solution to a simple problem. The biggest problem is the "is typeof" syntax, but that's not your fault.

So far so good.
 Second overload:

 R1 find(alias pred = "a == b", R1, R2)(R1 haystack, R2 needle)
 if (isForwardRange!R1 && isForwardRange!R2
 && is(typeof(binaryFun!pred(haystack.front, needle.front)) : bool)
 && !isRandomAccessRange!R1)
 {
 static if (is(typeof(pred == "a == b")) && pred == "a == b" &&
 isSomeString!R1 && isSomeString!R2
 && haystack[0].sizeof == needle[0].sizeof)
 {
 //return cast(R1) find(representation(haystack), representation(needle));
 // Specialization for simple string search
 alias Select!(haystack[0].sizeof == 1, ubyte[],
 Select!(haystack[0].sizeof == 2, ushort[], uint[]))
 Representation;
 // Will use the array specialization
 return cast(R1) .find!(pred, Representation, Representation)
 (cast(Representation) haystack, cast(Representation) needle);
 }
 else
 {
 return simpleMindedFind!pred(haystack, needle);
 }
 }

 As far as I can tell, this is a proxy for various substring search
 implementations.

 Problem 1: Why is find overloaded to do substring search? Why does it do
 substring and not subset or subsequence? substring is a completely
 different algorithm from linear search and even has different asymptotic
 running time. This is needlessly overloaded, it adds nothing.

This overload is an optimization that exploits the fact that comparing two strings for equality is the same as comparing their representations.
 Problem 2: The attempted specialisation is atrocious. It compares the
 predicate string with "a == b".

Yah, that's to detect the defaulted parameter (not to detect all comparisons for equality). I should have defined a separate overload that took one less parameter, but back when I wrote find() there were some compiler bugs that prevented me from doing so.
 I just did a quick check, and this means
 that these two calls DO NOT use the same algorithm:

 string a = ..., b = ...;
 auto fast = find!"a == b"(a, b);
 auto slow = find!"a==b"(a, b);

 I have to be honest, I have only just noticed this, but that's really
 sloppy.

Again, the point there is to detect the default (most frequently used), not all equality comparisons (which would be a much harder task). It's probably worth revisiting this implementation at some point and doing it the right way: an overload of find() for strings when a lambda is not specified.
 It's also a direct symptom of over-complicated code. As a user, I would
 100% expect these calls to be the same.

Why? Would you expect that these calls are also the same?

find!((a, b) => a == b)(a, b);
find!((a, b) => b == a)(a, b);
find!((a, b) => !(b != a))(a, b);
auto lambda = (a, b) => a == b;
find!lambda(a, b);

I'm sure you can figure out how difficult the task of detecting all semantically equivalent lambdas is. The point there is, again, to optimize the frequent case when people just call find(a, b). That's it.
 If I accidentally used the
 second version, I would have no warning: my code would just run slower
 and I'd be none the wiser. Only upon careful inspection of the source
 could you discover what this "simple" code does, and you would be
 horrified like I am now.

I don't see any reason that justifies one being horrified.
 The next two overloads of find appears to implement a couple of
 reasonably fast substring. Again, it would take me a minute or so to
 figure out exactly what algorithm was being used.

This is actually not bad at all. Standard library implementations are very highly leveraged, and writing them in a "simple" manner that a beginner can understand immediately is optimizing along the wrong dimensions. No language that's actually used does that, including those for which performance is traditionally not a primary concern, such as Lisp or Haskell.
 After that there's a multi-range find. Seems simple enough, but it's yet
 another overload to consider and I'm not sure it's commonly used enough
 to even warrant existence. I'd hate to think what would happen if I
 wanted the single-range search but accidentally added an extra parameter.

Implementing multi-range find properly is difficult for an end user and a natural thing to add to the standard library.
 In summary: the specialisation detection is shocking, which leads me to
 question what other static ifs are accidentally firing or failing. If
 the code was more simple, and less overloaded it would be easy to reason
 about all this, but it's not. The various find implementations span a
 few hundred lines, and all have numerous compile time branches and
 checks. The cyclomatic complexity must be huge.

I don't think you are putting forth good arguments to simplify things. Essentially you're saying you want to read the source of the standard library but don't care to have it run fast.
 How I'd do things:

 - The first version of find would be the only one. No overloads, no
 specialisation.

So out the window is fast string searching and many other simple optimizations such as e.g. looking for the last element of a bidirectional sequence in a random-access range first.
 - substring would be a separate, non-predicated function. It would be
 specialised for strings. I'm too tired now to think about the variety of
 range specialisations, but I'm sure there's a more elegant way to handle
 to combinations.

So now when we have strings we can use the slow find and the fast substring for essentially the same operation? Why? "Well we thought it makes the standard library implementation one bit simpler." Ehm.
 - Drop the variadic find altogether.

I guess we could do that, but again, it's difficult to implement correctly by the user. Then there's the backwards compatibility to worry about.
 - Get rid of BoyerMooreFinder + find overload, replace with a boyerMoore
 function.

That's an experiment that I haven't decided whether to declare successful or not. I figured the fundamental optimizations of find(a, b) are effected as follows:

1. Add structure to a (e.g. a is sorted, has a trie associated with it, etc.);
2. Add structure to b (e.g. KMP, BM, RK);
3. Add structure to both (e.g. search a sorted range in another).

The decision on which algorithm to use is often dictated by this concern - where am I willing to focus the preprocessing effort? So I thought I'd clarify that in the way find(a, b) is invoked: depending on the structure of a and b, different search decisions are taken. I'd been planning to add several other find implementations along these lines, but haven't gotten around to it yet.

Andrei
Jun 08 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/7/12 3:04 PM, Timon Gehr wrote:
 On 06/07/2012 08:34 PM, Peter Alexander wrote:
 I find this very ugly. To be honest, I would be much happier without all
 that mostNegative and common type stuff.
 If I want to get the min
 between a short and an int I'll just cast them appropriately.

The mostNegative and common type stuff is there to enable correct semantics for taking the minimum of signed and unsigned values. Minimum of short and int would work fine without that.
 I'd much rather have a simpler standard library than a complicated one
 for the
 sake of a little convenience.

'min' is not complicated.

I agree. BTW Howard Hinnant's proposal N2199 (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2199.html) introduces correct C++ min/max functions that take 175 lines and introduce 10 types and 12 specializations. (Proposal was rejected.) The power of min/max is about on a par with D's. Andrei
Jun 07 2012
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/7/12 5:47 PM, Peter Alexander wrote:
 On Thursday, 7 June 2012 at 22:08:10 UTC, Andrei Alexandrescu wrote:
 On 6/7/12 3:04 PM, Timon Gehr wrote:
 'min' is not complicated.

I agree.

Then how come it has a bug where it doesn't work with user-defined types?

I just disagree. It's not complicated for what it does. It simply sets out to solve a number of situations, and it solves them in pretty much as many words as the English description thereof.
 Maybe it isn't complicated per your definition, but it's complicated
 enough that it has a bug for a prominent use case.

Complication is not necessarily the reason for bugs. Your one-liner had a bug. Andrei
Jun 07 2012
prev sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 07.06.2012 22:34, Peter Alexander wrote:
 On Thursday, 7 June 2012 at 16:05:00 UTC, bearophile wrote:
 "Now What?" by Sean Parent (Adobe).

I very much liked that presentation. It's nice to see someone looking at C++ in the big picture. I also liked his comment on the "beauty" of std::pair

C++ has a far worse size-to-power ratio. The result is that generic programming is 2-10x longer (plus some pathological cases with up to ~100x). Not to mention readability. I mean, Boost doesn't do anything VERY advanced; it's just hard to keep focus on your goal while bending the C++ compiler to your needs.
 "Complete std::pair 372 Lines"

 D suffers from this too. Here is std.algorithm.min (ignoring the
 definitions of CommonType, mostNegative, and isIntegral).


 template MinType(T...)
 {
 static assert(T.length >= 2);
 static if (T.length == 2)
 {
 static if (!is(typeof(T[0].min)))
 alias CommonType!(T[0 .. 2]) MinType;
 else static if (mostNegative!(T[1]) < mostNegative!(T[0]))
 alias T[1] MinType;
 else static if (mostNegative!(T[1]) > mostNegative!(T[0]))
 alias T[0] MinType;
 else static if (T[1].max < T[0].max)
 alias T[1] MinType;
 else
 alias T[0] MinType;
 }
 else
 {
 alias MinType!(MinType!(T[0 .. 2]), T[2 .. $]) MinType;
 }
 }

 MinType!(T1, T2, T) min(T1, T2, T...)(T1 a, T2 b, T xs)
 {
 static if (T.length == 0)
 {
 static if (isIntegral!(T1) && isIntegral!(T2)
 && (mostNegative!(T1) < 0) != (mostNegative!(T2) < 0))
 static if (mostNegative!(T1) < 0)
 immutable chooseB = b < a && a > 0;
 else
 immutable chooseB = b < a || b < 0;
 else
 immutable chooseB = b < a;
 return cast(typeof(return)) (chooseB ? b : a);
 }
 else
 {
 return min(min(a, b), xs);
 }
 }


 I find this very ugly.

Very straightforward, very convenient and (importantly) an optimal result. As in "I would have written it by hand exactly the same", barring generic grease like CommonType & mostNegative that resolves to nothing most of the time.

 To be honest, I would be much happier without all
 that mostNegative and common type stuff. If I want to get the min
 between a short and an int I'll just cast them appropriately. I'd much
 rather have a simpler standard library than a complicated one for the
 sake of a little convenience.

STL & (to be frank) CRT are damn contrived. Ever tried reading the printf source from GLIBC? A polite recommendation: you'd better keep your favorite sedative nearby while you are at it. Yet memset is fast as lightning and handles a ton of cases with ease, and I don't have to think about it. The whole point of libraries, FWIW.
 Don't get me started on std.algorithm.find...

Why not? It's a perfect example of the goodness of generic programming: flexibility (for free) + speciality (when you need it) for the win. If your language lacks such a construct you end up inventing ways around it:
- make findInt, findString, quickFindString, findPerson etc. * number of containers
- shoehorn everything into a simpler design: qsort, boxing-unboxing etc.
 The same question is valid for D. It seems important.

It is. D addresses vectorization a little with its array ops (although ISPC (http://ispc.github.com/) destroys both D and C++ in this arena) and we're yet to see if D provides scalable parallelism.

std.parallelism? It handles SMP without breaking a sweat. GPGPU would be real nice though. And things like restrict(...) from C++ AMP could help, or better support for CTFE + DSLs. -- Dmitry Olshansky
Jun 07 2012
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 07.06.2012 20:04, bearophile wrote:
 The slide packs of the conference C++Now! 2012 are available:
 https://github.com/boostcon/cppnow_presentations_2012

Thanks, nice stuff for my brain to chew on. For one thing the ustring proposal caught my eye. It goes in the right direction ... for C++. My first observations on the matter:

1) We all know COW strings are not good... because in C++ you have damned shared-by-default, folks! That implies using atomic refcounting to be on the safe side, yet in D we have thread-local by default, and ref-counting/COW is perfectly fine (for non-shared types).

2) Immutable strings have a performance cost ... like tracking ownership and reference counting - unless you have a GC by your side, in which case it's cheap and legal :)

3) Embeddable strings are not so hard to get in D, since there are char[...] arrays (that are even correctly value types!).

4) Small string optimization is kind of cool... for mutable strings mostly, or as long as your GC sucks. (Small strings kind of prevent pooling, you know.)

All that being said, I could envision a FlexString type in D that does the following:
- easily integrates with string
- small string optimization (yay!)
- (maybe) allows any known encoding including legacy (and the list is extendable), converting on demand
- has a pluggable allocator (Andrei, where was that nice design? ;) )
- thus it could live on the stack

Unlike the C++ folks, we don't need:
- find etc. to be members of string
- legacy encodings (I'm personally against them; let's stick with UTF-8/UTF-16)
- customizable growth rates? (a strange problem, as I think preallocation is your best friend where it may matter)

As far as small strings go, I'd say a lot of containers would benefit from such packing; it could be nice to generalize the concept.

The end of the wish list: a Rope class would be cool, like... real cool. I guess std.container has to be a package though :)

-- 
Dmitry Olshansky
Jun 07 2012
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 08.06.2012 0:46, Dmitry Olshansky wrote:
 On 07.06.2012 20:04, bearophile wrote:
 The slide packs of the conference C++Now! 2012 are available:
 https://github.com/boostcon/cppnow_presentations_2012

Thanks, nice stuff for my brain to chew on.

Continuing through the pages, noting the apparent C++11 decadence. Yeah, they did an amazing job, but there is a ton of problems yet to fix (most things are "we should have had this long ago, but here it is..."). Now imagine the size of the language once it has modules, new concepts and some fresh kitchen-sink library additions... On top of all that a sign: "here be backwards compatibility".

Anyway, other thoughts:
- C++11 allocators - interesting; how do our upcoming ones compare? At least we should have more freedom in design.
- It's fun to see an M$ guy give a talk about regex. Last time I checked their STL it was a bleak shadow of the Boost one in terms of performance (why not just copy the good stuff if you can't get it right?).
- About Metaparse - it's quite delightful to see how these guys have to bend themselves into suicidal acrobatics to get C++ to parse a DSL at compile time. Best slides ever :)

-- 
Dmitry Olshansky
Jun 07 2012
prev sibling next sibling parent "Froglegs" <lugtug yahoo.com> writes:
  For some reason I found these docs really dull, too much 
rehashing of C++11 which is old hat now.

  The template stuff might be interesting if not for the knowledge 
that it will take absolutely forever to compile--
Jun 07 2012
prev sibling next sibling parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Thursday, 7 June 2012 at 20:04:56 UTC, Timon Gehr wrote:
 On 06/07/2012 08:34 PM, Peter Alexander wrote:
 I find this very ugly. To be honest, I would be much happier 
 without all
 that mostNegative and common type stuff.
 If I want to get the min
 between a short and an int I'll just cast them appropriately.

The mostNegative and common type stuff is there to enable correct semantics for taking the minimum of signed and unsigned values. Minimum of short and int would work fine without that.

I know what it's there for, I don't think it is necessary. I don't think people should be taking the min of signed and unsigned values. I don't think it's worth cluttering the implementation for this minor convenience.
 I'd much rather have a simpler standard library than a 
 complicated one for the
 sake of a little convenience.

'min' is not complicated.

It's more complicated than it could/should be:

T min(T)(T x, T y) { return x < y ? x : y; }

To be honest, I don't think the variadic argument versions are necessary either, as they just duplicate the functionality of reduce.
Jun 07 2012
prev sibling next sibling parent Artur Skawina <art.08.09 gmail.com> writes:
On 06/07/12 22:46, Dmitry Olshansky wrote:
 2) Immutable strings have performance cost ... like tracking ownership and
reference counting - unless you have GC by your side, in which case it's cheap
and legal :)

Until you have enough of them and your program starts to spend most of its time trying to GC them. [1]

artur

[1] OK, so you might need 13M+ strings totalling 450M+. Still.. ;)
Jun 07 2012
prev sibling next sibling parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Thursday, 7 June 2012 at 21:22:57 UTC, Timon Gehr wrote:
 There is not much there to clutter in this case. Therefore 
 cluttering
 does not have a significant drawback. Is it just about 
 aesthetics?

Aesthetics is part of it, but it's more than that.

It makes the code more difficult to read. I would need to step through the definition of MinType to find out what min(uint, int) returns.

It adds to compile times and compiler memory usage (it all adds up...)

It adds more code paths that need to be tested/debugged. Bugs like this, for example: http://d.puremagic.com/issues/show_bug.cgi?id=8158 (note: it compiles with my simple min). Also note how the error message leaks all the clutter from min, making it difficult to tell exactly what the error is. There would be no complicated errors from my version, no matter what you did.

It makes it difficult to reason about how different types work with the function (e.g. with Date). With the complicated version in std.algorithm I have to now worry about what isIntegral returns for my type, and what mostNegative returns. If I create a BigInteger type and a BigUnsignedInteger type, how do I make sure it works correctly if I use those two as arguments to min? I'm sure the answer is simple, but finding that answer is non-trivial.
 It's more complicated than it could/should be:

 T min(T)(T x, T y) { return x < y ? x : y; }

This implementation is unstable. If you fix that, then I'll agree that it is sufficient for most cases. There is not really a reason to restrict the generality of the implementation to this though. What we have is a strict superset.

Good spot on the stability. There's nothing wrong with a strict superset; we just have to admit that there is a non-zero cost to this extra generality. I have listed some of the costs above. Some are arguable, but it is inarguable that at least one bug has arisen as a direct result of its complexity. I suspect there will be more over time. Multiply that by all the functions that try to be excessively general.
Jun 07 2012
prev sibling next sibling parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Thursday, 7 June 2012 at 22:08:10 UTC, Andrei Alexandrescu 
wrote:
 On 6/7/12 3:04 PM, Timon Gehr wrote:
 'min' is not complicated.

I agree.

Then how come it has a bug where it doesn't work with user-defined types? Maybe it isn't complicated per your definition, but it's complicated enough that it has a bug for a prominent use case.
Jun 07 2012
prev sibling next sibling parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Thursday, 7 June 2012 at 22:29:09 UTC, Andrei Alexandrescu 
wrote:
 Great points, example could be a lot better. Maybe it's time 
 you do get started on find(). Destroy or be destroyed.

Ok... This overload:

R find(alias pred = "a == b", R, E)(R haystack, E needle)
if (isInputRange!R &&
    is(typeof(binaryFun!pred(haystack.front, needle)) : bool))
{
    for (; !haystack.empty; haystack.popFront())
    {
        if (binaryFun!pred(haystack.front, needle)) break;
    }
    return haystack;
}

is fine. In fact it's practically perfect. It's the obvious solution to a simple problem. The biggest problem is the "is typeof" syntax, but that's not your fault.

Second overload:

R1 find(alias pred = "a == b", R1, R2)(R1 haystack, R2 needle)
if (isForwardRange!R1 && isForwardRange!R2
    && is(typeof(binaryFun!pred(haystack.front, needle.front)) : bool)
    && !isRandomAccessRange!R1)
{
    static if (is(typeof(pred == "a == b")) && pred == "a == b"
        && isSomeString!R1 && isSomeString!R2
        && haystack[0].sizeof == needle[0].sizeof)
    {
        //return cast(R1) find(representation(haystack), representation(needle));
        // Specialization for simple string search
        alias Select!(haystack[0].sizeof == 1, ubyte[],
            Select!(haystack[0].sizeof == 2, ushort[], uint[]))
            Representation;
        // Will use the array specialization
        return cast(R1) .find!(pred, Representation, Representation)
            (cast(Representation) haystack, cast(Representation) needle);
    }
    else
    {
        return simpleMindedFind!pred(haystack, needle);
    }
}

As far as I can tell, this is a proxy for various substring search implementations.

Problem 1: Why is find overloaded to do substring search? Why does it do substring and not subset or subsequence? Substring search is a completely different algorithm from linear search and even has a different asymptotic running time. This overload is needless; it adds nothing.

Problem 2: The attempted specialisation is atrocious. It compares the predicate string with "a == b". I just did a quick check, and this means that these two calls DO NOT use the same algorithm:

string a = ..., b = ...;
auto fast = find!"a == b"(a, b);
auto slow = find!"a==b"(a, b);

I have to be honest, I have only just noticed this, but that's really sloppy.
It's also a direct symptom of over-complicated code. As a user, I would 100% expect these calls to be the same. If I accidentally used the second version, I would have no warning: my code would just run slower and I'd be none the wiser. Only upon careful inspection of the source could you discover what this "simple" code does, and you would be horrified like I am now.

The next two overloads of find appear to implement a couple of reasonably fast substring searches. Again, it would take me a minute or so to figure out exactly what algorithm was being used.

After that there's a multi-range find. Seems simple enough, but it's yet another overload to consider, and I'm not sure it's commonly used enough to even warrant existence. I'd hate to think what would happen if I wanted the single-range search but accidentally added an extra parameter.

In summary: the specialisation detection is shocking, which leads me to question what other static ifs are accidentally firing or failing. If the code were simpler and less overloaded, it would be easy to reason about all this, but it's not. The various find implementations span a few hundred lines, and all have numerous compile-time branches and checks. The cyclomatic complexity must be huge.

How I'd do things:
- The first version of find would be the only one. No overloads, no specialisation.
- substring would be a separate, non-predicated function. It would be specialised for strings. I'm too tired now to think about the variety of range specialisations, but I'm sure there's a more elegant way to handle the combinations.
- Drop the variadic find altogether.
- Get rid of BoyerMooreFinder + find overload, replace with a boyerMoore function.
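The whitespace pitfall described above is easy to reproduce. A minimal sketch (both calls return the same result; the point is that only the first spelling matches the `pred == "a == b"` test and takes the specialised string-search path, while the second silently falls through to the generic linear search):

```d
import std.algorithm : find;

void main()
{
    string a = "hello world", b = "world";
    // Literal matches "a == b" exactly: specialised string-search path.
    auto fast = find!"a == b"(a, b);
    // Same semantics, different spelling: generic (slower) path.
    auto slow = find!"a==b"(a, b);
    assert(fast == slow && fast == "world");
}
```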
Jun 07 2012
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, June 08, 2012 00:47:06 Peter Alexander wrote:
 On Thursday, 7 June 2012 at 22:08:10 UTC, Andrei Alexandrescu
 
 wrote:
 On 6/7/12 3:04 PM, Timon Gehr wrote:
 'min' is not complicated.

I agree.

Then how come it has a bug where it doesn't work with user-defined types?

Because it wasn't properly tested for that. Anything can have a bug in it. It's true that the more complicated something is, the more likely it is that it's going to have bugs, but the fact that something has a bug doesn't necessarily mean that it's overly complicated. I admit that it surprised me how complicated std.algorithm.min is, but I don't think that it's all that bad. That extra complication does allow it to handle comparing signed and unsigned types more correctly. It's just a matter of fixing this one bug and adding proper tests so that it doesn't break like this again. And there's already a pull request to fix that (though it appears that I forgot to put a comment for it in the bug): https://github.com/D-Programming-Language/phobos/pull/612 - Jonathan M Davis
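The missing coverage is straightforward to add. A sketch of the kind of unittest that would have caught the user-defined-type regression (`Frac` is a hypothetical ordered type invented for illustration, not anything from the bug report):

```d
import std.algorithm : min;

// Hypothetical user-defined ordered type with opCmp.
struct Frac
{
    int num, den;
    int opCmp(const Frac o) const { return num * o.den - o.num * den; }
}

unittest
{
    assert(min(Frac(1, 3), Frac(1, 2)) == Frac(1, 3)); // 1/3 < 1/2
    assert(min(3, 5) == 3);
    // The mixed signed/unsigned case is what min's extra machinery pays for.
    assert(min(-1, 2u) == -1);
}
```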
Jun 07 2012
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, June 08, 2012 00:30:06 Dmitry Olshansky wrote:
 On 07.06.2012 22:34, Peter Alexander wrote:
 Don't get me started on std.algorithm.find...

Why not? It's a perfect example of the goodness of generic programming: flexibility (for free) + speciality (when you need it) for the win. If your language lacks such constructs you end up inventing ways around it:
- make findInt, findString, quickFindString, findPerson etc., multiplied by the number of containers
- shoehorn everything into a simpler design: qsort, boxing-unboxing etc.

std.algorithm does a lot of really cool stuff in a very generic way, and on the whole, it works very well. It's true that its functions tend to be more complicated than most, but they're generally not complicated to use, and if you're going to have that kind of complication, it's best to have it in the standard library written by experts in the language rather than trying to write it yourself. And D manages to make such functions _very_ sane in comparison to what C++ would need to do to get the same result. If anything, the problem here is simply that not all of these functions are tested as well as they should be (as the bug with min shows).

Personally, my favorite function in std.algorithm is startsWith (or endsWith). It just about blew my mind when I realized what it was doing. It is fantastically cool how it uses recursive templates to get its job done, and it makes for an incredibly powerful and easy to use function.

- Jonathan M Davis
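For readers who haven't met it, the variadic form is what makes startsWith so handy: one call checks several needles and reports which one matched (as a 1-based index, 0 meaning none). A minimal usage sketch:

```d
import std.algorithm : startsWith, endsWith;

void main()
{
    assert(startsWith("hello world", "hello"));
    // Variadic form: returns the 1-based index of the matching needle.
    assert(startsWith("hello world", "bye", "hello") == 2);
    assert(startsWith("hello world", "bye", "ciao") == 0); // no match
    assert(endsWith("hello world", "world"));
}
```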
Jun 07 2012
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, June 08, 2012 01:50:36 Peter Alexander wrote:
 - Drop the variadic find altogether.

It's _great_ that functions like find are variadic. It makes them more powerful, and they're still easy to use. Ease of use, power, and performance are _far_ more important than code readability when talking about the standard library. These functions are being used by _everyone_ and are maintained by a relatively small number of people. If you were arguing that they were hard to use, then that would be one thing, but arguing that they're hard to read is arguing about something which matters comparatively little.

And for as powerful as the functions in std.algorithm are, they don't tend to be all that complicated. find is certainly one of the most complicated (probably _the_ most complicated actually), but it's not complicated to use, and you get a lot of power out of it. And considering how insanely common it is to use find, having it be powerful is critical.

- Jonathan M Davis
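For context, the variadic overload under debate searches for several needles at once; as I understand the std.algorithm documentation, it returns a tuple of the remaining haystack and the 1-based index of the needle that matched (0 if none did). A minimal sketch:

```d
import std.algorithm : find;

void main()
{
    auto r = find("hello world", "wor", "xyz");
    assert(r[0] == "world"); // haystack from the first hit onward
    assert(r[1] == 1);       // the first needle ("wor") is the one that matched
}
```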
Jun 08 2012
prev sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 7 June 2012 at 22:46:33 UTC, Dmitry Olshansky wrote:
 On 08.06.2012 0:46, Dmitry Olshansky wrote:
 On 07.06.2012 20:04, bearophile wrote:
 The slide packs of the conference C++Now! 2012 are available:
 https://github.com/boostcon/cppnow_presentations_2012

Thanks, nice stuff for my brain to chew on.

checked their STL it was a bleak shadow of boost ones in terms of performance (why not just copy good stuff if you can't get it right?)

Big companies have a lot of NIH syndrome. Plus many, especially Microsoft, don't like to use licenses other than their own.

-- Paulo
Jun 10 2012