digitalmars.D.learn - Faster Dlang Execution

seany <seany uni-bonn.de> writes:
Hello

How can I increase the speed of executable files created via :

`dub build -b release`

I am unable to parallelise all of it, as it depends on part of 
the result being calculated before something else can be 
calculated.

I have many `nonsafe, nonpure` functions. Classes are virtual by 
default. Profiling doesn't help, because different input is 
causing different parts of the program to become slow.

Thank you.
Jun 08 2021
mw <mingwu gmail.com> writes:
On Tuesday, 8 June 2021 at 17:10:47 UTC, seany wrote:
 Hello

 How can I increase the speed of executable files created via :

 `dub build -b release`

 I am unable to parallelise all of it, as it depends on part of 
 the result being calculated before something else can be 
 calculated.
You need to write the parallel code yourself; the compiler doesn't know your app's logic. Here is how the D standard library can help: https://tour.dlang.org/tour/en/multithreading/std-parallelism
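
As a minimal sketch of what that can look like when only part of the work is independent: the loop in `stageOne` is parallelised with `std.parallelism.parallel`, while `stageTwo` runs afterwards because it needs all of stage one's results. The function names and the `expensive` computation are made up for illustration; they are not from the original program.

```d
import std.math : sqrt;
import std.parallelism : parallel;
import std.stdio : writeln;

/// Hypothetical per-element work that does not depend on other elements.
double expensive(double x)
{
    return sqrt(x) * 2.0;
}

/// Independent iterations: safe to spread across worker threads.
double[] stageOne(double[] input)
{
    auto result = new double[input.length];
    // Each iteration writes only to its own slot, so there is no data race.
    foreach (i, x; parallel(input))
        result[i] = expensive(x);
    return result;
}

/// Needs every result of stage one, so it runs sequentially afterwards.
double stageTwo(const double[] partial)
{
    double total = 0;
    foreach (x; partial)
        total += x;
    return total;
}

void main()
{
    auto partial = stageOne([1.0, 4.0, 9.0, 16.0]);
    writeln(stageTwo(partial));
}
```

The parallel `foreach` only returns once all iterations are done, so the dependency between the two stages is respected without extra synchronisation.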
Jun 08 2021
Basile B. <b2.temp gmx.com> writes:
On Tuesday, 8 June 2021 at 17:10:47 UTC, seany wrote:
 Hello

 How can I increase the speed of executable files created via :

 `dub build -b release`
Try `dub build -b release --compiler=ldc2`. Then you can set some LDC-specific DFlags, such as `-O3` or `--mcpu`.
 I am unable to parallelise all of it, as it depends on part of 
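
A rough sketch of how such flags could be set in `dub.json`, assuming the `dflags-ldc` key so that they only apply when building with LDC. The package name `myapp` is a placeholder, and `-mcpu=native` is just one plausible choice: it tunes the code for the build machine, so the binary may not run on older CPUs.

```json
{
    "name": "myapp",
    "dflags-ldc": ["-mcpu=native"]
}
```

The `release` build type already requests optimisation, so an explicit optimisation level is left out of this sketch.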
 the result being calculated before something else can be 
 calculated.

 I have many `nonsafe, nonpure` functions.
`nothrow` presumably opens up optimisation opportunities in stack management, although I haven't verified that.
 Classes are virtual by default.
Mark them `final` when possible. When that is not possible, mark the virtual methods that are never overridden `final`.
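
A small sketch of both cases; `Shape` and `Square` are made-up names, not from the original program. `Shape` must stay open for subclassing, so only the method that is never overridden is marked `final`; `Square` is a leaf, so the whole class is `final`.

```d
import std.stdio : writeln;

class Shape
{
    // Virtual by default: subclasses may override it.
    double area() const { return 0; }

    // Never overridden, so it is marked final. The call can be resolved
    // at compile time and becomes a candidate for inlining.
    final double doubledArea() const { return 2 * area(); }
}

// A final class cannot be subclassed, so calls through a Square
// reference do not need virtual dispatch.
final class Square : Shape
{
    private double side;

    this(double side) { this.side = side; }

    override double area() const { return side * side; }
}

void main()
{
    auto sq = new Square(3);
    writeln(sq.area());
    writeln(sq.doubledArea());
}
```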
 Profiling doesn't help, because different input is causing 
 different parts of the program to become slow.
If you're on Linux, you can try callgrind + kcachegrind instead of the built-in instrumentation.
 Thank you.
Jun 08 2021
Jack <jckj33 gmail.com> writes:
On Tuesday, 8 June 2021 at 17:40:19 UTC, Basile B. wrote:
 On Tuesday, 8 June 2021 at 17:10:47 UTC, seany wrote:
 Hello

 How can I increase the speed of executable files created via :

 `dub build -b release`
 Try `dub build -b release --compiler=ldc2`. Then you can set some LDC-specific DFlags, such as `-O3` or `--mcpu`.
Also, there's the `--parallel` flag supported by ldc2. I used it quite a while ago, but I'm pretty sure it is still there.
Jun 08 2021
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Jun 08, 2021 at 05:10:47PM +0000, seany via Digitalmars-d-learn wrote:
[...]
 Profiling doesn't help, because different input is causing different
 parts of the program to become slow.
[...]

Do you have any more specific information about what kind of inputs cause which parts of the program to slow down? Without more details it's hard to say what the problem is. But I'd say, if you care about performance you should fix *all* of the slow parts that your profiler finds.

There are some performance best practices that you should follow, such as reducing frequent GC allocations, avoiding expensive algorithms (like O(n^2) or worse) where possible, avoiding excessive copying of data, performing I/O in larger blocks instead of small bits at a time, avoiding excessive indirection (final methods where they don't need to be virtual, by-value types instead of deep dereferencing, etc.), caching frequently-computed results, etc.

But more importantly, if you can elaborate a bit more on what your program is trying to do, it would help us give more specific recommendations. There may be domain-specific optimizations that you could apply as well.

T

-- 
MS Windows: 64-bit rehash of 32-bit extensions and a graphical shell for a 16-bit patch to an 8-bit operating system originally coded for a 4-bit microprocessor, written by a 2-bit company that can't stand 1-bit of competition.
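
Two of the practices listed in this post, fewer GC allocations and caching of frequently computed results, can be sketched in a few lines. This is only an illustration: `slowSquare` stands in for some expensive computation and is not from the original program. It uses `std.array.appender` with `reserve` to allocate the output buffer once, and `std.functional.memoize` to cache results per argument.

```d
import std.array : appender;
import std.functional : memoize;
import std.stdio : writeln;

/// Hypothetical stand-in for an expensive computation (n * n by repeated addition).
ulong slowSquare(uint n)
{
    ulong sum = 0;
    foreach (i; 0 .. n)
        sum += n;
    return sum;
}

// memoize keeps a per-thread cache keyed on the argument.
alias cachedSquare = memoize!slowSquare;

ulong[] squares(const uint[] inputs)
{
    auto buf = appender!(ulong[])();
    // Reserve capacity up front instead of growing the array step by step.
    buf.reserve(inputs.length);
    foreach (n; inputs)
        buf.put(cachedSquare(n));
    return buf.data;
}

void main()
{
    writeln(squares([2u, 3u, 2u]));
}
```

Caching trades memory for time, so it only pays off when the same arguments really do recur.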
Jun 08 2021
seany <seany uni-bonn.de> writes:
On Tuesday, 8 June 2021 at 18:03:32 UTC, H. S. Teoh wrote:

 But more importantly, if you can elaborate a bit more on what 
 your program is trying to do, it would help us give more 
 specific recommendations. There may be domain-specific 
 optimizations that you could apply as well.


 T
Hi,

The program is trying to categorize GPS tracks. It has to identify tracks that count as (somewhat) parallel (this is difficult to define).

So I draw lines through points that have at most 5 m (as measured by Vincenty's formula) RMS error from the trend line. Then I look for lines that can be considered "turn lines" (a turn joining two parallel lines). Then I draw a best fit boundary around it. I lay a square grid, and remove the squares where no line can be found. Then I use this algorithm: https://stackoverflow.com/questions/50885339/polygon-from-a-grid-of-squares

This runs at O(N²) for sure.

Does this help?
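
For illustration only, the grid step described above could be sketched as an occupancy grid. This is a simplification with assumed names and inputs: it marks cells that contain sample points rather than fitted lines, and it assumes the points are already projected to local coordinates in metres. Marking cells this way is linear in the number of points.

```d
import std.math : floor;
import std.stdio : writeln;

/// Hypothetical point in local planar coordinates (metres).
struct Point
{
    double x, y;
}

/// Returns a rows x cols grid; true means at least one point falls in that cell.
bool[][] occupancy(const Point[] pts, double cell, size_t rows, size_t cols)
{
    auto grid = new bool[][](rows, cols);
    foreach (p; pts)
    {
        // Skip points that lie before the grid origin.
        if (p.x < 0 || p.y < 0)
            continue;
        immutable c = cast(size_t) floor(p.x / cell);
        immutable r = cast(size_t) floor(p.y / cell);
        // Skip points that lie beyond the grid extent.
        if (r < rows && c < cols)
            grid[r][c] = true;
    }
    return grid;
}

void main()
{
    auto grid = occupancy([Point(0.5, 0.5), Point(2.5, 1.5)], 1.0, 3, 3);
    writeln(grid);
}
```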
Jun 08 2021
mw <mingwu gmail.com> writes:
On Tuesday, 8 June 2021 at 22:04:26 UTC, seany wrote:
 The program is trying to categorize GPS tracks.
 It has to identify tracks that count as (somewhat) parallel 
 (this is difficult to define).
Maybe this is what you are looking for: https://en.wikipedia.org/wiki/Dynamic_time_warping. You can run it on the GPU with this: https://github.com/garrettwrong/cuTWED
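
To give an idea of what dynamic time warping computes, here is a minimal textbook sketch on one-dimensional sequences. It is not the cuTWED implementation. For GPS tracks the absolute difference would have to be replaced by a distance between coordinates, which is left out here. Note that it fills a full cost table, so it is O(n·m) in the two sequence lengths.

```d
import std.algorithm.comparison : min;
import std.math : abs;
import std.stdio : writeln;

/// Classic dynamic-programming DTW distance between two sequences.
double dtw(const double[] a, const double[] b)
{
    // cost[i][j] = best alignment cost of a[0 .. i] against b[0 .. j].
    auto cost = new double[][](a.length + 1, b.length + 1);
    foreach (row; cost)
        row[] = double.infinity;
    cost[0][0] = 0;

    foreach (i; 1 .. a.length + 1)
    {
        foreach (j; 1 .. b.length + 1)
        {
            immutable d = abs(a[i - 1] - b[j - 1]);
            // Extend the cheapest of: advance in a, advance in b, or advance in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]);
        }
    }
    return cost[a.length][b.length];
}

void main()
{
    writeln(dtw([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0]));
}
```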
Jun 08 2021