digitalmars.D - How accurate is dmd profile? (and do I need GDC/LDC to use gprof?)
- Chris Katko (28/28) Oct 03 2021 Does it break down on multi-threaded scenarios?
- max haughton (6/13) Oct 03 2021 Unless your profiler does call stack sampling (which I don't
- drug (5/21) Oct 03 2021 Both sampling and instrumenting profiling can be unreliable. Sampling
- H. S. Teoh (13/18) Oct 03 2021 [...]
- Chris Katko (3/5) Oct 03 2021 WHAT?! Wouldn't 32/64 bit (architecture native) values be faster
- bauss (2/20) Oct 03 2021 What the... I had no idea but that alone deems it useless to me.
- Imperatorn (2/22) Oct 05 2021 Maybe we can change it
- H. S. Teoh (8/22) Oct 06 2021 [...]
Does it break down on multi-threaded scenarios?

I'm running dmd (newest) + Allegro (a C game programming library) with DAllegro (a nice templated binding). My executable is multi-threaded (mostly just helper functions / glue logic from libraries/D/etc.) and uses OpenGL on 64-bit Linux with a very recent DMD release.

The function that dmd's -profile reported as the biggest user of CPU time was a tiny little function that draws a couple of background tiles (<30 at worst case), as opposed to everything else being drawn: tons of OpenGL primitives, graphical text, text being converted with tons of writelns to console, etc.

It didn't make any sense, so I loaded it up with valgrind and kcachegrind, and it said, "no, this function takes 0.00 of total time."

Should I be using DMD's -profile? Does it have known failure modes? Is this failure mode new to people?

Is there any way to get normal profiling with gprof or whatever with DMD, or do I need to compile with LDC or GDC? I'm getting back into D and I recall having both toolchains (LDC and DMD) running. This might have been the reason I kept LDC around and maintained two sets of libraries compiled for both LDC and DMD.

Also, the "-profile" functions used over 7% of all CPU time. Is that the nature of the profiling, or is D using way more than comparable languages/compilers?

Lastly, is there any way to demangle D function names in Valgrind/kcachegrind?

Thanks! Have a great weekend!
Oct 03 2021
On Sunday, 3 October 2021 at 08:31:14 UTC, Chris Katko wrote:
> Does it break down on multi-threaded scenarios? I'm running dmd (newest) + Allegro (a C game programming library) with DAllegro (a nice templated binder). My executable is multi-threaded (mostly just helper functions / glue logic from libraries/D/etc), and using OpenGL on 64-bit Linux with a very recent DMD release.
> [...]

Unless your profiler does call stack sampling (which I don't think gprof or dmd does), don't use it. They're not reliable unless you are doing very targeted profiling.

For profiling code, if you're on an Intel, VTune is top dog. Nothing else is as good.
Oct 03 2021
03.10.2021 19:20, max haughton wrote:
> On Sunday, 3 October 2021 at 08:31:14 UTC, Chris Katko wrote:
>> Does it break down on multi-threaded scenarios? I'm running dmd (newest) + Allegro (a C game programming library) with DAllegro (a nice templated binder). My executable is multi-threaded (mostly just helper functions / glue logic from libraries/D/etc), and using OpenGL on 64-bit Linux with a very recent DMD release.
>> [...]
>
> Unless your profiler does call stack sampling (which I don't think gprof or dmd does), don't use it. They're not reliable unless you are doing very targeted profiling.
>
> For profiling code, if you're on an Intel, VTune is top dog. Nothing else is as good.

Both sampling and instrumenting profilers can be unreliable. A sampling profiler's results depend on the sampling rate, for example, while an instrumenting profiler can change timings too much. In fact, sampling and instrumentation complement each other.
Oct 03 2021
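The remark that an instrumenting profiler can change timings too much can be made concrete with a little arithmetic. The Python sketch below assumes a fixed bookkeeping cost (the hypothetical `overhead_ns`) is added to every instrumented call. The function names and all numbers are invented for illustration; they are not measurements of dmd -profile, and this is only one possible mechanism for a skewed report.

```python
# Toy model: how per-call instrumentation overhead can skew a profile.
# All workloads and costs below are made-up numbers for illustration.

def reported_share(functions, overhead_ns):
    """Return each function's share of total time when every call pays
    `overhead_ns` of extra bookkeeping.

    `functions` maps a name to (number_of_calls, true_cost_per_call_ns).
    """
    totals = {
        name: calls * (cost_ns + overhead_ns)
        for name, (calls, cost_ns) in functions.items()
    }
    grand_total = sum(totals.values())
    return {name: total / grand_total for name, total in totals.items()}


if __name__ == "__main__":
    workload = {
        "draw_tile": (1_000_000, 10),      # tiny function, called very often
        "render_scene": (100, 1_000_000),  # heavy function, called rarely
    }
    for overhead in (0, 200):
        shares = reported_share(workload, overhead)
        print(f"overhead {overhead} ns:",
              {name: f"{share:.1%}" for name, share in shares.items()})
```

With these assumed numbers, the tiny function's share rises from roughly 9% with no overhead to roughly 68% once each call pays 200 ns of bookkeeping, even though its real work is unchanged. A sampling profiler avoids that per-call cost but depends on the sampling rate instead, which is why the two approaches complement each other.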
On Sun, Oct 03, 2021 at 08:31:14AM +0000, Chris Katko via Digitalmars-d wrote:
[...]
> The function that dmd's -profile reported as the biggest user of CPU time was a tiny little function that draws a couple of background tiles (<30 at worst case), as opposed to everything else being drawn: tons of OpenGL primitives, graphical text, text being converted with tons of writelns to console, etc.
[...]

Be aware that dmd -profile uses *16-bit counters* for tracking function call counts; if your program is CPU-intensive and calls the same function(s) in inner loops more than 65535 times, the counters will wrap around and cause the profile output to be garbled.

I ran into this a couple of years ago when trying to profile some CPU-intensive code with some non-trivial testcases, and found dmd -profile output completely unusable because of this limitation.

T

--
Recently, our IT department hired a bug-fix engineer. He used to work for Volkswagen.
Oct 03 2021
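To show what the wrap-around described in the post above looks like, here is a minimal Python sketch of a call counter truncated to a fixed number of bits. It only models the behaviour reported in this thread, assuming a 16-bit unsigned counter. It is not dmd's or druntime's actual code, and the names `COUNTER_BITS` and `count_calls` are hypothetical.

```python
# Model of a call counter stored in a fixed-width unsigned integer.
# Assumption: the counter is 16 bits wide, as reported in the thread.
# This is an illustration, not the real profiler implementation.

COUNTER_BITS = 16


def count_calls(actual_calls: int, bits: int = COUNTER_BITS) -> int:
    """Return the value a `bits`-wide unsigned counter holds after
    `actual_calls` increments."""
    mask = (1 << bits) - 1  # 0xFFFF (65535) for 16 bits
    counter = 0
    for _ in range(actual_calls):
        counter = (counter + 1) & mask  # wraps to 0 after the maximum
    return counter


if __name__ == "__main__":
    for calls in (1_000, 65_535, 65_536, 70_000):
        print(f"{calls:>6} real calls -> counter shows {count_calls(calls)}")
```

Under this model, a function called 70,000 times in an inner loop would appear to have been called only 4,464 times, so any figure derived from that count would be misleading.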
On Sunday, 3 October 2021 at 21:55:29 UTC, H. S. Teoh wrote:
> Be aware that dmd -profile uses *16-bit counters* for tracking function call counts;

WHAT?! Wouldn't 32/64-bit (architecture-native) values be faster for memory accesses anyway??
Oct 03 2021
On Sunday, 3 October 2021 at 21:55:29 UTC, H. S. Teoh wrote:
> On Sun, Oct 03, 2021 at 08:31:14AM +0000, Chris Katko via Digitalmars-d wrote:
> [...]
>> The function that dmd's -profile reported as the biggest user of CPU time was a tiny little function that draws a couple of background tiles (<30 at worst case), as opposed to everything else being drawn: tons of OpenGL primitives, graphical text, text being converted with tons of writelns to console, etc.
> [...]
>
> Be aware that dmd -profile uses *16-bit counters* for tracking function call counts; if your program is CPU-intensive and calls the same function(s) in inner loops more than 65535 times, the counters will wrap around and cause the profile output to be garbled.
>
> I ran into this a couple of years ago when trying to profile some CPU-intensive code with some non-trivial testcases, and found dmd -profile output completely unusable because of this limitation.
>
> T

What the... I had no idea, but that alone deems it useless to me.
Oct 03 2021
On Monday, 4 October 2021 at 06:10:00 UTC, bauss wrote:
> On Sunday, 3 October 2021 at 21:55:29 UTC, H. S. Teoh wrote:
>> On Sun, Oct 03, 2021 at 08:31:14AM +0000, Chris Katko via Digitalmars-d wrote:
>> [...]
>>> [...]
>> [...]
>>
>> Be aware that dmd -profile uses *16-bit counters* for tracking function call counts; if your program is CPU-intensive and calls the same function(s) in inner loops more than 65535 times, the counters will wrap around and cause the profile output to be garbled.
>>
>> I ran into this a couple of years ago when trying to profile some CPU-intensive code with some non-trivial testcases, and found dmd -profile output completely unusable because of this limitation.
>>
>> T
>
> What the... I had no idea, but that alone deems it useless to me.

Maybe we can change it
Oct 05 2021
On Wed, Oct 06, 2021 at 05:34:54AM +0000, Imperatorn via Digitalmars-d wrote:
> On Monday, 4 October 2021 at 06:10:00 UTC, bauss wrote:
>> On Sunday, 3 October 2021 at 21:55:29 UTC, H. S. Teoh wrote:
[...]
>>> Be aware that dmd -profile uses *16-bit counters* for tracking function call counts; if your program is CPU-intensive and calls the same function(s) in inner loops more than 65535 times, the counters will wrap around and cause the profile output to be garbled.
>>>
>>> I ran into this a couple of years ago when trying to profile some CPU-intensive code with some non-trivial testcases, and found dmd -profile output completely unusable because of this limitation.
[...]
>> What the... I had no idea, but that alone deems it useless to me.
>
> Maybe we can change it

That would be very nice. I believe the code is somewhere in druntime. It would save a lot of grief in the future. :-)

T

--
An elephant: A mouse built to government specifications. -- Robert Heinlein
Oct 06 2021