digitalmars.D.learn - dmd memory usage/static lib/algorithm bug?
- Marek Janukowicz (44/44) Aug 28 2013 This is really a cross-domain issue, but I didn't feel like splitting it...
- H. S. Teoh (63/110) Aug 28 2013 Possible causes that I can think of are:
- Marek Janukowicz (44/44) Aug 28 2013 I was finally able to create simple test case that probably reproduces t...
- H. S. Teoh (11/32) Aug 28 2013 It doesn't seem to happen on git HEAD. I'm going to try 2.063.2 and see
- Marek Janukowicz (4/26) Aug 28 2013 64-bit
- H. S. Teoh (9/14) Aug 28 2013 [...]
- Marek Janukowicz (30/42) Aug 28 2013 Yeah, it makes a difference :)
- H. S. Teoh (23/42) Aug 29 2013 [...]
- H. S. Teoh (12/31) Aug 28 2013 [...]
This is really a cross-domain issue, but I didn't feel that splitting it into separate posts would make sense.

I use DMD 2.063.2 on Linux 64-bit. I have some code in my (non-trivial) application that basically corresponds to this:

    import std.stdio, std.algorithm, std.array;

    void main () {
      struct Val {
        int i;
      }

      Val[] arr;
      arr ~= Val( 3 );
      arr ~= Val( 1 );
      arr ~= Val( 2 );
      auto sorter = (Val a, Val b) { return a.i < b.i; };
      writefln( "sorted: %s", arr.sort!(sorter));
    }

While this simple example works, I'm getting segfaults with corresponding code in this bigger project. Those segfaults can be traced down to algorithm.d line 8315 or another line (8358?) that use this "sorter" lambda I passed to "sort!" - suggesting it is a bad memory reference.

I tried to create a simple test case that would fail similarly, but to no avail. I can't make the code for my whole project available, so let's just say it's either some bug in DMD or something caused by my limited knowledge of D.

Now the funny things begin: I copied algorithm.d to my project in an attempt to make some modifications to it (hopefully to fix the problem or at least get some insight into its nature), but things miraculously started working! This leads me to the suspicion that there is something wrong with the libphobos2.a file provided with the DMD tarball.

The next problem in line is that compilation of my project with algorithm.d included takes almost 4GB of RAM. While I'm aware of the fact that DMD deliberately doesn't free memory for performance purposes, this makes the compilation fail due to insufficient memory on machines with 4GB RAM (and some of it taken).

So my questions are:
- how can I limit DMD memory usage?
- how can I include a static library with my project? I can compile algorithm.d to a static lib, but how do I include this one explicitly with my project while the rest of Phobos should be taken from the stock libphobos2.a?
- any other ideas how to solve my problems on any level?

--
Marek Janukowicz
Aug 28 2013
On Wed, Aug 28, 2013 at 05:29:40PM +0200, Marek Janukowicz wrote:
> [...]
> While this simple example works, I'm getting segfaults with corresponding code in this bigger project. Those segfaults can be traced down to algorithm.d line 8315 or another line (8358?) that use this "sorter" lambda I passed to "sort!" - suggesting it is a bad memory reference.

Possible causes that I can think of are:

1) You have a struct somewhere and your lambda closes over it (or one of its members), but later on you return this struct to another scope and invoke the lambda. But since structs can get moved around when passed between different functions, the lambda's context pointer is invalid and so it crashes. On simple programs this problem may be hidden because you don't use the stack as much, so the old copy of the struct may still be intact even though that part of the stack is technically no longer valid, so the lambda may still *appear* to work.

2) There is a compiler bug that generates wrong code for a lambda. There have been some such bugs before where it fails to recognize a lambda, or fails to notice that a local variable is closed over, so it doesn't move the local variable to the heap but leaves it on the stack, where it gets invalidated afterwards, causing the lambda to crash.

> I tried to create a simple test case that would fail similarly, but to no avail. I can't make the code for my whole project available, so let's just say it's either some bug in DMD or something caused by my limited knowledge of D.

If you have a reliable way of reproducing the problem and can encapsulate it into a shell script / batch file, you can use Vladimir Panteleev's DustMite to automatically reduce your code to the minimum for reproducing the bug. See:

https://github.com/D-Programming-Language/tools/tree/master/DustMite

> Now the funny things begin: I copied algorithm.d to my project in an attempt to make some modifications to it (hopefully to fix the problem or at least get some insight into its nature), but things miraculously started working! This leads me to the suspicion there is something wrong with libphobos2.a file provided with DMD tarball.

This is another possibility. :) Did you check whether DMD is linking the correct version of libphobos2.a into your program? Sometimes strange things can happen when you have stale copies of older versions of libphobos2.a lying around your system, and DMD accidentally picks those up instead of the correct version.

But it could also be that this is merely masking the problem. Invalid pointers are notorious for causing heisenbugs that appear/disappear when you move unrelated code around.

> Next problem in the line is that compilation of my project with algorithm.d included takes almost 4GB of RAM. While I'm aware of the fact DMD deliberately doesn't free the memory for performance purposes, this makes the compilation fail due to insufficient memory on machines with 4GB RAM (and some taken).

You could try splitting it up. :) Well, we're planning to split it up at some point, now that DMD supports package.d. Here's roughly how you might do it:

- Temporarily rename std/algorithm.d into another file.
- Create a directory called std/algorithm/
- Create the file std/algorithm/package.d containing something like this:

    module std.algorithm;
    public import std.algorithm.search;
    public import std.algorithm.sort;
    public import std.algorithm.set;
    public import std.algorithm.mutation;
    ...

- Split up algorithm.d into the above parts (std/algorithm/search.d, std/algorithm/sort.d, ... etc.).

You probably don't really need to follow exactly the above division; any partitioning of std.algorithm into mutually-independent parts will do. Probably splitting into just two parts will cut down memory usage enough to make it compilable on your system.

Or, if this is too much work for your purposes, you could make a copy of std.algorithm, rename it to my.algorithm, say, update all your imports accordingly, and then just edit the file and delete the parts you don't use. For example, if you don't use any of the set functions (merge, cartesianProduct, etc.), just delete them from the file along with all their unittests. This should reduce the amount of memory needed to compile it.

> So my questions are:
> - how can I limit DMD memory usage?

I'll let others answer, since I'm not that familiar with DMD source code myself.

> - how can I include a static library with my project? I can compile algorithm.d to a static lib, but how do I include this one explicitly with my project while the rest of Phobos should be taken from the stock libphobos2.a ?

This could be tricky. One way to do it is to rename std.algorithm to my.algorithm (as mentioned above), compile that into libmyalgo.a, and then do something like:

    dmd program.d ... -ofprogram -L-L. -L-lmyalgo

Hope this helps.

T

--
Just because you survived after you did it, doesn't mean it wasn't stupid!
Aug 28 2013
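Cause 1 in the reply above can be illustrated without running into undefined behaviour. The following is a minimal sketch that is not taken from the thread; `Holder`, `get`, `bind` and `read` are hypothetical names chosen for illustration. It shows only the safe half of the failure mode: a delegate stored in a struct keeps pointing at the instance it was bound to, so a copy of the struct carries a context pointer to the old location. If that old location were a stack slot that had since gone out of scope, calling the delegate would read invalid memory.

```d
// Hypothetical illustration; none of these names come from the thread.
struct Holder
{
    int value;
    int delegate() read;

    int get() { return value; }

    // &get is a delegate whose context pointer is the address of this instance.
    void bind() { read = &get; }
}

void main()
{
    Holder a;
    a.value = 1;
    a.bind();

    Holder b = a;   // plain copy: the delegate is copied bit for bit
    b.value = 2;

    // The copied delegate still refers to `a`, not to `b`.
    assert(b.read.ptr is cast(void*) &a);
    assert(b.read() == 1);
}
```

Here `a` is still alive when `b.read()` runs, so the call is well defined; the crash scenario described in the reply would arise only once the original instance is gone.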
I was finally able to create a simple test case that probably reproduces the bug (probably, because the stack trace is completely different, but the code that is there is similar). This requires 2 source code files:

main.d:

    module main;

    // This line must be there - importing any module from std is necessary to
    // reproduce the bug
    import std.stdio;

    import sorter;

    void main () {
    }

-------------------

sorter.d:

    module sorter;

    import std.algorithm;

    struct Val {
      int i;
    }

    unittest {
      Val [] arr;
      arr ~= Val( 2 );
      arr ~= Val( 1 );
      // This works
      arr.sort!((Val a, Val b) { return a.i < b.i; });
      // This segfaults when sorting
      auto dg = (Val a, Val b) { return a.i < b.i; };
      arr.sort!(dg);
    }

------------------

Run with:

    dmd -unittest main.d sorter.d && ./main

For me this results in a segfault. Changing one of many seemingly unrelated details (e.g. moving the offending code directly to main, commenting out the std.stdio import in main.d) makes the problem disappear.

Can anyone try to reproduce that? Again, I'm on DMD 2.063.2.

H. S. Teoh - thanks for your detailed description, but this test case probably sheds some more light and invalidates some of your hypotheses. As for building a static library - I thought it would be easier, so if the bug remains unresolved I'll probably just rebuild the whole of Phobos. The problem is definitely not some old version of libphobos2.a stuck around, because the problem could be reproduced exactly the same way on the 3 machines I tried.

--
Marek Janukowicz
Aug 28 2013
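The test case above crashes only when the predicate is first stored in a local variable and that variable is then passed to `sort!`. One way to avoid that pattern might be to pass a module-level function instead, so that the template argument is a plain symbol with no enclosing stack frame. This is a sketch under that assumption, not a fix confirmed in the thread, and it has not been checked against DMD 2.063.2; `lessByI` and `sortedByI` are hypothetical names.

```d
import std.algorithm : sort;

struct Val { int i; }

// Hypothetical helper: a module-level function has no context pointer,
// so nothing from a caller's stack frame is involved.
bool lessByI(Val a, Val b) { return a.i < b.i; }

// Sorts the slice in place and returns it for convenience.
Val[] sortedByI(Val[] arr)
{
    arr.sort!lessByI();   // the alias argument is a symbol, not a local variable
    return arr;
}

void main()
{
    auto arr = [Val(2), Val(1), Val(3)];
    assert(sortedByI(arr) == [Val(1), Val(2), Val(3)]);
}
```

The inline form `arr.sort!((Val a, Val b) { return a.i < b.i; })`, which the test case reports as working, is the other option when a separate function is not wanted.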
On Wed, Aug 28, 2013 at 11:02:17PM +0200, Marek Janukowicz wrote:
> I was finally able to create simple test case that probably reproduces the bug (probably, because the stack trace is completely different, but the code that is there is similar). This requires 2 source code files:
[...]
> Run with: dmd -unittest main.d sorter.d && ./main
>
> For me this results in a segfault. Changing one of many seemingly unrelated details (eg. moving offending code directly to main, commenting out std.stdio import in main.d) makes the problem disappear.
>
> Can anyone try to reproduce that? Again, I'm on DMD 2.063.2.

It doesn't seem to happen on git HEAD. I'm going to try 2.063.2 and see what happens.

Oh, and BTW, are you on Linux 32-bit or 64-bit? Don't know if that makes a difference, but just in case.

> H. S. Teoh - thanks for your detailed description, but this test case probably sheds some more light and invalidates some of your hypotheses. As for building static library - I thought it would be easier, so if the bug remains unresolved I'll probably just rebuild the whole phobos. The problem is definitely not some old version of libphobos2.a stuck around, because the problem could be reproduced exactly the same way on 3 machines I tried.
[...]

I'll give it a try on 2.063.2 to see if I can reproduce the problem.

T

--
Ruby is essentially Perl minus Wall.
Aug 28 2013
H. S. Teoh wrote:
> On Wed, Aug 28, 2013 at 11:02:17PM +0200, Marek Janukowicz wrote:
>> I was finally able to create simple test case that probably reproduces the bug (probably, because the stack trace is completely different, but the code that is there is similar). This requires 2 source code files:
> [...]
>> Run with: dmd -unittest main.d sorter.d && ./main
>>
>> For me this results in a segfault. Changing one of many seemingly unrelated details (eg. moving offending code directly to main, commenting out std.stdio import in main.d) makes the problem disappear.
>>
>> Can anyone try to reproduce that? Again, I'm on DMD 2.063.2.
>
> It doesn't seem to happen on git HEAD. I'm going to try 2.063.2 and see what happens.
>
> Oh, and BTW, are you on Linux 32-bit or 64-bit? Don't know if that makes a difference, but just in case.

64-bit

--
Marek Janukowicz
Aug 28 2013
On Thu, Aug 29, 2013 at 12:45:05AM +0200, Marek Janukowicz wrote:
> H. S. Teoh wrote:
[...]
>> Oh, and BTW, are you on Linux 32-bit or 64-bit? Don't know if that makes a difference, but just in case.
>
> 64-bit
[...]

Maybe try compiling with -m32 and see if it makes a difference? If so, it may be a 64-bit related dmd bug. I'm also having trouble building a working compiler toolchain with a purely 64-bit environment.

T

--
Fact is stranger than fiction.
Aug 28 2013
H. S. Teoh wrote:
> On Thu, Aug 29, 2013 at 12:45:05AM +0200, Marek Janukowicz wrote:
>> H. S. Teoh wrote:
> [...]
>>> Oh, and BTW, are you on Linux 32-bit or 64-bit? Don't know if that makes a difference, but just in case.
>>
>> 64-bit
> [...]
>
> Maybe try compiling with -m32 and see if it makes a difference? If so, it may be a 64-bit related dmd bug. I'm also having trouble building a working compiler toolchain with a purely 64-bit environment.

Yeah, it makes a difference :)

./main(_D4core7runtime18runModuleUnitTestsUZb19unittestSegvHandlerUiPS4core3sys5posix6signal9siginfo_tPvZv+0x2c) [0x80a8c64]
linux-gate.so.1(__kernel_rt_sigreturn+0x0)[0xffffe410]
./main(_D6sorter14__unittestL9_3FZv101__T13quickSortImplS65_D6sorter14__unittestL9_3FZv2dgPFNaNbNfS6sorter3ValS6sorter3ValZbTAS6sorter3ValZ13quickSortImplMFAS6sorter3ValZv+0x1a7) [0x80a1b53]
./main(_D6sorter14__unittestL9_3FZv122__T4sortS65_D6sorter14__unittestL9_3FZv2dgPFNaNbNfS6sorter3ValS6sorter3ValZbVE3std9algorithm12SwapStrategy0TAS6sorter3ValZ4sortMFAS6sorter3ValZS6sorter14__unittestL9_3FZv99__T11SortedRangeTAS6sorter3ValS65_D6sorter14__unittestL9_3FZv2dgPFNaNbNfS6sorter3ValS6sorter3ValZbZ11SortedRange+0x17) [0x80a20cb]
./main(_D6sorter14__unittestL9_3FZv+0x6d)[0x80a2079]
./main(_D6sorter9__modtestFZv+0x8)[0x80a21ac]
./main(_D4core7runtime18runModuleUnitTestsUZb16__foreachbody352MFKPS6object10ModuleInfoZi+0x24) [0x80a8ccc]
./main(_D2rt5minfo17moduleinfos_applyFMDFKPS6object10ModuleInfoZiZi16__foreachbody541MFKS2rt14sections_linux3DSOZi+0x37) [0x80a5f47]
./main(_D2rt14sections_linux3DSO7opApplyFMDFKS2rt14sections_linux3DSOZiZi+0x2c) [0x80a619c]
./main(_D2rt5minfo17moduleinfos_applyFMDFKPS6object10ModuleInfoZiZi+0x14) [0x80a5ef4]
./main(runModuleUnitTests+0x87)[0x80a8bd7]
./main(_D2rt6dmain211_d_run_mainUiPPaPUAAaZiZi6runAllMFZv+0x25)[0x80a4a55]
./main(_D2rt6dmain211_d_run_mainUiPPaPUAAaZiZi7tryExecMFMDFZvZv+0x18) [0x80a46c0]
./main(_d_run_main+0x121)[0x80a4691]
./main(main+0x14)[0x80a4564]
/lib32/libc.so.6(__libc_start_main+0xf3)[0xf74d4943]
Segmentation fault (core dumped)

This stacktrace did not show in 64-bit version, but the problem persists.

--
Marek Janukowicz
Aug 28 2013
On Thu, Aug 29, 2013 at 01:46:03AM +0200, Marek Janukowicz wrote:
> H. S. Teoh wrote:
[...]
>> Maybe try compiling with -m32 and see if it makes a difference? If so, it may be a 64-bit related dmd bug. I'm also having trouble building a working compiler toolchain with a purely 64-bit environment.
>
> Yeah, it makes a difference :)
[...]
> This stacktrace did not show in 64-bit version, but the problem persists.
[...]

OK, my trouble with compiling 2.063.2 in 64-bit was actually my own fault -- I had a faulty dmd.conf -- so actually that has nothing to do with your problem.

Anyway, I don't think it's a problem with libphobos2.a, because tracing through your failing test case, I see that it's all template functions instantiated from std.algorithm, and no static functions from the library are called at the point of failure. The template code in Phobos also looks kosher, so right now I'm suspecting a DMD codegen bug. I'm going to investigate the disassembly now to figure out what's going on.

The good news is that the upcoming dmd 2.064 doesn't appear to have this problem: the part of std.algorithm that concerns your code doesn't appear to have been touched since 2.063.2, so the problem is unlikely to be there. So whatever dmd bug is causing the problem has been fixed in git HEAD.

The bad news is that dmd 2.064 has some changes that make it incompatible with 2.063.2 druntime/phobos, so you won't be able to use it unless you build the entire dmd/druntime/phobos toolchain from git.

T

--
"No, John. I want formats that are actually useful, rather than over-featured megaliths that address all questions by piling on ridiculous internal links in forms which are hideously over-complex." -- Simon St. Laurent on xml-dev
Aug 29 2013
On Wed, Aug 28, 2013 at 02:10:34PM -0700, H. S. Teoh wrote:
> On Wed, Aug 28, 2013 at 11:02:17PM +0200, Marek Janukowicz wrote:
>> I was finally able to create simple test case that probably reproduces the bug (probably, because the stack trace is completely different, but the code that is there is similar). This requires 2 source code files:
> [...]
>> Run with: dmd -unittest main.d sorter.d && ./main
>>
>> For me this results in a segfault. Changing one of many seemingly unrelated details (eg. moving offending code directly to main, commenting out std.stdio import in main.d) makes the problem disappear.
>>
>> Can anyone try to reproduce that? Again, I'm on DMD 2.063.2.
>
> It doesn't seem to happen on git HEAD. I'm going to try 2.063.2 and see what happens.
[...]

Update: I've reproduced this problem on 2.063.2, but while trying to track down the problem, I discovered that building 2.063.2 from source doesn't produce a working toolchain. :-( I've spent an hour trying to figure out what's wrong with 2.063.2, but with no success.

So, you *could* be right that something may be wrong with libphobos2.a. I'll try to download the tarball from dlang.org again and see if that helps, or if I've messed up my system somehow. :-P

T

--
There are two ways to write error-free programs; only the third one works.
Aug 28 2013