digitalmars.D - Importance of memory organization for speed
- Bill Cox (6/6) Jun 09 2008 Hi, all.
- renoX (7/28) Jun 13 2008 Uh? What you just did is using your knowledge of the memory layout in C
- Nick B (6/39) Jun 14 2008 Hi there
- Russell Lewis (15/34) Jun 14 2008 In a perfect world, a compiler can perform deep optimizations, similar
Hi, all.

Waaay back, there was a short discussion of optimizing memory layout for speed. I've written a simple benchmark that traverses large graphs, in two versions: one written in very carefully memory-optimized C, the other using C++/STL. The C version is 15X faster and uses half the memory on my Ubuntu x64 Core Duo laptop. Cachegrind shows the C version has a 16.7X lower L2 cache miss rate, which accounts for the speed difference.

So, I'll just post again the importance of keeping memory layout abstract, and hidden from the user. More and more, speed for memory-intensive applications is all about cache performance.

Benchmarks can be found in the examples/graph_benchmark directory of svn for the datadraw project:

svn co https://datadraw.svn.sourceforge.net/svnroot/datadraw/trunk datadraw

Best regards,
Bill
Jun 09 2008
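For readers who want a concrete picture of what a cache-conscious graph layout can look like, here is a minimal sketch in C. It is not the code from the datadraw benchmark; the names Graph, first_edge, edge_target and count_reachable are made up for illustration, and the layout (flat arrays addressed by 32-bit index instead of heap nodes linked by pointer) is only one common way to get this effect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical compact graph: edges live in flat arrays and nodes refer to
   them by 32-bit index rather than by pointer. */
typedef struct {
    uint32_t num_nodes;
    const uint32_t *first_edge;  /* num_nodes + 1 entries; the edges of node n
                                    are edge_target[first_edge[n]] up to, but
                                    not including, edge_target[first_edge[n + 1]] */
    const uint32_t *edge_target; /* destination node of each edge */
} Graph;

/* Depth-first traversal from `start`; returns the number of nodes reached.
   The caller supplies scratch space: `visited` (num_nodes bytes, all zero)
   and `stack` (num_nodes entries). Each node is pushed at most once, so the
   stack cannot overflow. */
uint32_t count_reachable(const Graph *g, uint32_t start,
                         uint8_t *visited, uint32_t *stack)
{
    assert(start < g->num_nodes);

    uint32_t top = 0;
    uint32_t reached = 0;

    visited[start] = 1;
    stack[top++] = start;

    while (top > 0) {
        uint32_t n = stack[--top];
        reached++;

        /* The edges of one node are adjacent in memory, so this inner loop
           walks a contiguous range instead of chasing pointers. */
        for (uint32_t e = g->first_edge[n]; e < g->first_edge[n + 1]; e++) {
            uint32_t t = g->edge_target[e];
            if (!visited[t]) {
                visited[t] = 1;
                stack[top++] = t;
            }
        }
    }
    return reached;
}
```

With 32-bit indices each edge costs 4 bytes instead of the 8 bytes a pointer takes on x64, and the edges of a node share cache lines. A program built around a layout like this can be run under valgrind --tool=cachegrind to see its cache miss counts.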
Bill Cox wrote:
> So, I'll just post again the importance of keeping memory layout
> abstract, and hidden from the user.

Uh? What you just did is using your knowledge of the memory layout in C to speed up your app, so it's the *opposite* of having the memory layout hidden from the user! I don't catch your point here.

Regards,
renoX
Jun 13 2008
Hi there

Does anyone know how to measure L1 and L2 cache performance using D and Tango, or is the _only_ way to do this to use Valgrind?

regards
Nick B
Jun 14 2008
renoX wrote:
> What you just did is using your knowledge of the memory layout in C
> to speed up your app, so it's the *opposite* of having the memory
> layout hidden from the user! I don't catch your point here.

In a perfect world, a compiler can perform deep optimizations, similar to hand-tuning your program. But it can't do that if you have already halfway specified the memory layout. So in that perfect world, you actually want to *underspecify* your program, so that the compiler can work miracles. However, if your compiler isn't as good as that, then hand-tuning is the better option.

An interesting observation is that for straight-line code (constrained within a single function), it used to be that hand-tuned C (or, better yet, assembler) would be much faster than what any compiler could produce. Nowadays, compilers generally produce code that is as good as (if not better than) what assembly experts write.

I would suspect that 20 years from now, our compilers will rework the memory layout just like they currently rework the ordering of operations in our functions. But I don't think that we're there yet.
Jun 14 2008
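To make the point about a "halfway specified" layout concrete, here is a small sketch in C. C requires struct members to be laid out in declaration order, so the compiler cannot regroup the fields itself; the programmer has to do it by hand. The type and function names (NodeAoS, NodesSoA, sum_weights_aos, sum_weights_soa) and the 56-byte label are assumptions chosen for illustration, not anything from the benchmark.

```c
#include <assert.h>
#include <stddef.h>

#define LABEL_LEN 56

/* Array-of-structs: the layout is fixed by the declaration. Every weight is
   followed by its label, so a loop that only reads weights still drags the
   labels through the cache. */
typedef struct {
    double weight;          /* hot: read on every iteration */
    char label[LABEL_LEN];  /* cold: not touched by the loop below */
} NodeAoS;

double sum_weights_aos(const NodeAoS *nodes, size_t n)
{
    assert(nodes != NULL || n == 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += nodes[i].weight;
    return sum;
}

/* Struct-of-arrays: the hand-tuned alternative. All weights are contiguous,
   so the same loop touches only the data it needs. */
typedef struct {
    const double *weight;            /* n weights, packed together */
    const char (*label)[LABEL_LEN];  /* n labels, stored separately */
} NodesSoA;

double sum_weights_soa(const NodesSoA *nodes, size_t n)
{
    assert(nodes != NULL);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += nodes->weight[i];
    return sum;
}
```

Both functions compute the same result; only the memory they touch differs. On a machine with 64-byte cache lines and 8-byte doubles, the first form typically fits one weight per line and the second fits eight. A language that left the layout unspecified could in principle let the compiler choose between the two forms.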