www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - From C++14 and Java 1.8

reply "bearophile" <bearophileHUGS lycos.com> writes:
Through Reddit I've seen this interesting report about C++14:

http://isocpp.org/blog/2013/04/trip-report-iso-c-spring-2013-meeting


Variable templates: sometimes in D I've felt a little desire for 
a shorter syntax, a templated alias:

alias Foo(T) = Bar!(T, "red");

But in the end I think this that syntax sugar isn't so necessary.

- - - - - - - - -

Dynamic Arrays: they replace the variable-sized stack-allocated 
arrays of C99. I'd like something like this in D2 (Issue 9832) 
with a better integration with the type system.

- - - - - - - - -

optional<T>: see the little maybeTo helper function I have 
suggested in
http://d.puremagic.com/issues/show_bug.cgi?id=6840

-----------------------------------

 From another thread about the future Java 1.8 I've seen Java 
parallelSort:

http://blog.sanaulla.info/2013/04/08/arrays-sort-versus-arrays-parallelsort/


That blog post explains quickly how it works:

<<
Arrays#parallelSort uses Fork/Join framework introduced in Java 7 
to assign the sorting tasks to multiple threads available in the 
thread pool. This is called eating your own dog food. Fork/Join 
implements a work stealing algorithm where in a idle thread can 
steal tasks queued up in another thread.
An overview of Arrays#parallelSort:

The method uses a threshold value and any array of size lesser 
than the threshold value is sorted using the Arrays#sort() API 
(i.e sequential sorting). And the threshold is calculated 
considering the parallelism of the machine, size of the array and 
is calculated as:

private static final int getSplitThreshold(int n) {
   int p = ForkJoinPool.getCommonPoolParallelism();
   int t = (p > 1) ? (1 + n / (p << 3)) : n;
   return t < MIN_ARRAY_SORT_GRAN ? MIN_ARRAY_SORT_GRAN : t;
}

Once its decided whether to sort the array in parallel or in 
serial, its now to decide how to divide the array in to multiple 
parts and then assign each part to a Fork/Join task which will 
take care of sorting it and then another Fork/Join task which 
will take care of merging the sorted arrays. The implementation 
in JDK 8 uses this approach:
- Divide the array into 4 parts.
- Sort the first two parts and then merge them.
- Sort the next two parts and then merge them.
And the above steps are repeated recursively with each part until 
the size of the part to sort is not lesser than the threshold 
value calculated above.


I think it's worth adding something similar as strategy of std.algorithm.sort. Bye, bearophile
Apr 21 2013
next sibling parent reply "dsimcha" <dsimcha yahoo.com> writes:
On Sunday, 21 April 2013 at 12:08:54 UTC, bearophile wrote:
 Arrays#parallelSort uses Fork/Join framework introduced in Java 
 7 to assign the sorting tasks to multiple threads available in 
 the thread pool. This is called eating your own dog food. 
 Fork/Join implements a work stealing algorithm where in a idle 
 thread can steal tasks queued up in another thread.
 An overview of Arrays#parallelSort:

 The method uses a threshold value and any array of size lesser 
 than the threshold value is sorted using the Arrays#sort() API 
 (i.e sequential sorting). And the threshold is calculated 
 considering the parallelism of the machine, size of the array 
 and is calculated as:

 private static final int getSplitThreshold(int n) {
   int p = ForkJoinPool.getCommonPoolParallelism();
   int t = (p > 1) ? (1 + n / (p << 3)) : n;
   return t < MIN_ARRAY_SORT_GRAN ? MIN_ARRAY_SORT_GRAN : t;
 }

 Once its decided whether to sort the array in parallel or in 
 serial, its now to decide how to divide the array in to 
 multiple parts and then assign each part to a Fork/Join task 
 which will take care of sorting it and then another Fork/Join 
 task which will take care of merging the sorted arrays. The 
 implementation in JDK 8 uses this approach:
 - Divide the array into 4 parts.
 - Sort the first two parts and then merge them.
 - Sort the next two parts and then merge them.
 And the above steps are repeated recursively with each part 
 until the size of the part to sort is not lesser than the 
 threshold value calculated above.


I think it's worth adding something similar as strategy of std.algorithm.sort.

FWIW, I created a parallel sort in D a while back using std.parallelism. It was part of std.parallel_algorithm, a half-finished project that I abandoned because I was disappointed at how poorly most of it was scaling in practice, probably due to memory bandwidth. If you have some expensive-to-compare types, though, it may be worthwhile. https://github.com/dsimcha/parallel_algorithm/blob/master/parallel_algorithm.d
Apr 21 2013
parent Martin Nowak <code dawg.eu> writes:
On 04/21/2013 04:46 PM, dsimcha wrote:
 2.  Different hardware than I tested on, maybe with better memory
 bandwidth.

Your implementation performs a lot of copying. Maybe an in-place parallel sort algorithm would perform better, e.g. parallel quicksort.
 3.  Expensive comparison functions.  I didn't test this in D either
 because I couldn't think of a good use case.  I tested the D parallel
 sort using small primitive types (ints and floats and stuff).

String sorting is a good use case with slightly higher comparison cost.
Apr 22 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
dsimcha:

 I abandoned because I was disappointed at how poorly most of it 
 was scaling in practice, probably due to memory bandwidth.

Then do you know why the Java version seems to be advantageous (with four cores)? Bye, bearophile
Apr 21 2013
prev sibling next sibling parent "dsimcha" <dsimcha yahoo.com> writes:
On Sunday, 21 April 2013 at 13:30:32 UTC, bearophile wrote:
 dsimcha:

 I abandoned because I was disappointed at how poorly most of 
 it was scaling in practice, probably due to memory bandwidth.

Then do you know why the Java version seems to be advantageous (with four cores)? Bye, bearophile

I don't know Java very well, but possiblities include: 1. Sorting using a virtual or otherwise non-inlined comparison function. This makes the sorting require much more CPU time but not a lot more memory bandwidth. It does beg the question, though, of why the comparison function isn't inlined, especially since modern JITs can sometimes inline virtual functions. 2. Different hardware than I tested on, maybe with better memory bandwidth. 3. Expensive comparison functions. I didn't test this in D either because I couldn't think of a good use case. I tested the D parallel sort using small primitive types (ints and floats and stuff).
Apr 21 2013
prev sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
 Dynamic Arrays: they replace the variable-sized stack-allocated 
 arrays of C99. I'd like something like this in D2 (Issue 9832) 
 with a better integration with the type system.

With some more changes: http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20130415/078426.html Bye, bearophile
Apr 21 2013