digitalmars.D.bugs - [Issue 15831] New: IFTI voldemort type exploding bloat
- via Digitalmars-d-bugs (145/145) Mar 25 2016 https://issues.dlang.org/show_bug.cgi?id=15831
https://issues.dlang.org/show_bug.cgi?id=15831 Issue ID: 15831 Summary: IFTI voldemort type exploding bloat Product: D Version: D2 Hardware: All OS: All Status: NEW Severity: enhancement Priority: P1 Component: dmd Assignee: nobody puremagic.com Reporter: schveiguy yahoo.com The current name mangling rules for template expansion lead to excessive bloat when voldemort types are returned and used in subsequent chained calls. This pattern is used in some places in Phobos for instance to allow wrapped ranges. An example is std.range.chain. If this pattern is used recursively, the resulting template mangling expands at an exponential rate, because the template parameter is used multiple times in the expansion (once for the type name for the template, and once for the type name of the parameter). Because the return type depends on the function name mangling, the cycle continues as you expand the chain. A simple example: (Please read through the whole bug report, as I have comments in between a lot of output) struct S(T) { void foo(){ throw new Exception("1");} } auto s(T)(T t) { version(bad) { struct Result { void foo(){ throw new Exception("2");} } return Result(); } else return S!T(); } void main(string[] args) { auto x = 1.s.s.s.s.s; x.foo; } If I compile this without -version=bad, I get this: Stevens-MacBook-Pro:testd steves$ time dmd testexpansion.d real 0m0.058s user 0m0.040s sys 0m0.014s Stevens-MacBook-Pro:testd steves$ ls -l testexpansion* -rwxr-xr-x+ 1 steves staff 409948 Mar 25 11:59 testexpansion -rw-r--r--+ 1 steves staff 324 Mar 25 11:59 testexpansion.d -rw-r--r--+ 1 steves staff 7104 Mar 25 11:59 testexpansion.o Stevens-MacBook-Pro:testd steves$ ./testexpansion object.Exception testexpansion.d(3): 1 ---------------- 4 testexpansion 0x00000001050edc14 pure safe void testexpansion.S!(testexpansion.S!(testexpansion.S!(testexpansion.S!(testexpansion.S!(int).S).S).S).S).S.foo() + 144 5 testexpansion 0x00000001050ed837 _Dmain + 119 6 ... # etc. Now, let's compile with the -bad version: Stevens-MacBook-Pro:testd steves$ time dmd -version=bad testexpansion.d real 0m0.058s user 0m0.041s sys 0m0.014s Stevens-MacBook-Pro:testd steves$ ls -l testexpansion* -rwxr-xr-x+ 1 steves staff 429412 Mar 25 12:00 testexpansion -rw-r--r--+ 1 steves staff 324 Mar 25 11:59 testexpansion.d -rw-r--r--+ 1 steves staff 15312 Mar 25 12:00 testexpansion.o Stevens-MacBook-Pro:testd steves$ ./testexpansion object.Exception testexpansion.d(12): 2 ---------------- 4 testexpansion 0x000000010f639c0c pure safe void testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).s(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).Result).s(testexpansion.s!(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).s(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).Result).Result).s(testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).s(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).Result).s(testexpansion.s!(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).s(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).Result).Result).Result.foo() + 144 5 testexpansion 0x000000010f639807 _Dmain + 119 6 ... # etc. Note the extreme increase in object size. Also note the much larger symbol name when the exception is thrown. Let's expand the number of s calls to 10: auto x = 1.s.s.s.s.s.s.s.s.s.s; New results: Stevens-MacBook-Pro:testd steves$ time dmd testexpansion.d real 0m0.074s user 0m0.043s sys 0m0.018s Stevens-MacBook-Pro:testd steves$ ls -l testexpansion* -rwxr-xr-x+ 1 steves staff 431716 Mar 25 14:01 testexpansion -rw-r--r--+ 1 steves staff 334 Mar 25 14:01 testexpansion.d -rw-r--r--+ 1 steves staff 16528 Mar 25 14:01 testexpansion.o OK, so the size of the object doubled for doubling the number of calls/types. Not unexpected. Stevens-MacBook-Pro:testd steves$ ./testexpansion object.Exception testexpansion.d(3): 1 ---------------- 4 testexpansion 0x0000000104816bec pure safe void testexpansion.S!(testexpansion.S!(testexpansion.S!(testexpansion.S!(testexpansion.S!(testexpansion.S!(testexpansion.S!(testexpansion.S!(testexpansion.S!(testexpansion.S!(int).S).S).S).S).S).S).S).S).S).S.foo() + 144 5 testexpansion 0x00000001048164c6 _Dmain + 214 6 ... # etc. A reasonable expanding of the symbol name. Let's try the bad version: Stevens-MacBook-Pro:testd steves$ time dmd -version=bad testexpansion.d real 0m0.164s user 0m0.145s sys 0m0.016s Comparing to original compilation time of testexpansion with bad version (0.058s), this is a significant slowdown. My tests show me that the major slowdown is calling the linker. Stevens-MacBook-Pro:testd steves$ ls -l testexpansion* -rwxr-xr-x+ 1 steves staff 1308740 Mar 25 14:01 testexpansion -rw-r--r--+ 1 steves staff 334 Mar 25 14:01 testexpansion.d -rw-r--r--+ 1 steves staff 382168 Mar 25 14:01 testexpansion.o Here the object has drastically blown up from 15k to 382k. I'm timing the execution, because this is significant as well: Stevens-MacBook-Pro:testd steves$ time ./testexpansion object.Exception testexpansion.d(12): 2 ---------------- 4 testexpansion 0x0000000109b83bec pure safe void testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).s(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).Result).s(testexpansion.s!(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).s(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).Result).Result).s(testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).s(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).Result).s(testexpansion.s!(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).s(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).Result).Result).Result).s(testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).s(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).Result).s(testexpansion.s!(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).s(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).Result).Result).s(testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).s(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).Result).s(testexpansion.s!(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).s(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).Result).Result).Result).Result).s(testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).s(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).Result).s(testexpansion.s!(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).s(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).Result).Result).s(testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).s(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).Result).s(testexpansion.s!(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).s(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).Result).Result).Result).s(testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).s(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).Result).s(testexpansion.s!(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).s(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).Result).Result).s(testexpansion.s!(testexpansion.s!(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).s(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).Result).s(testexpansion.s!(testexpansion.s!(testexpansion.s!(int).s(int).Result).s(testexpansion.s!(int).s(int).Result).Result).s(testexpansion.s!(testexpansion.s!(int).s(i 5 testexpansion 0x0000000109b83476 _Dmain + 214 6 ... # etc. real 0m2.801s user 0m2.790s sys 0m0.004s That's right, it takes 2.8 seconds to decide to print not quite the entire symbol name, before giving up. So this affects runtime issues as well. I tried adding 5 more calls to .s, and the object file climbed to 12MB, along with the compilation taking at least 1.5 minutes before I killed it. This is not a real-world example, but I have encountered real world examples with my iopipe library. Originally I made all the wrapping types voldemort types. Compiling a test program was at about 10MB and showed similar slowdowns when an exception was thrown. Moving the voldemort types outside the functions reduced the size to <1MB, and fixed the exception problems. I think if Phobos used more voldemort types and less external structs (there actually aren't that many voldemort ones as early range/algorithm code didn't have the luxury), I think we would have run into this sooner. We need some way to elide all the repeated type names, either via compression or substitution, or something. If we can factor out the multiplier for the symbol mangling, we can fix the problem. --
Mar 25 2016