digitalmars.D - A few numbers on allocation in dmd

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
I got a few numbers on what types dmd allocates the most while compiling 
a large project. The project is decently large (takes minutes to build) 
and uses a lot of compile-time stuff, but I'd think that'd be not 
atypical for a D program because people who wouldn't need that kind of 
stuff wouldn't derive much advantage from using D in the first place. So 
it seems a representative corpus.

The "-profile=gc" and "-lowmem" flags should help with all that, and 
they do work (thanks to all who hopped on 
https://issues.dlang.org/show_bug.cgi?id=20960 and helped). However, the 
inclusion of that profiling makes the compiler unbearably slow even for 
moderately-sized programs, so I resorted to a low-tech solution by means 
of inserting:

printf("%s\n", ci.info.name.ptr);

at
https://github.com/dlang/dmd/blob/b6b0c0f41a476c4eaa88ba106fb4de1175d40440/src/dmd/root/rmem.d#L240
(thanks kinke for pointing me there).

That produces a hecatomb of output - we're talking tens of gigs for a 
large project. There are a lot of duplicates because there aren't that 
many distinct types, so the normal solution would be the classic:

dub build | sort | uniq -c | sort -nr >sorted.log

Problem being, of course, the temporary files would be huge and the 
extra time spent would be just crazy. A hashtable turned out to fix that 
problem:

dub build | awk '{ a[$0]++ }END{ for(i in a) print a[i],i }' \
     | sort -nr >sorted.log

(did I mention low-tech?) So at the end of all this I got the attached
file listing the types dmd allocates most often while compiling. (It's
the number of allocations, i.e. calls to new, not the size allocated;
collecting total bytes allocated would bring additional, but different,
information.)
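
For anyone who wants to try the counting step without a multi-gigabyte log, here is a minimal shell sketch of the same awk idiom on a tiny made-up input. `sample_log` is a hypothetical stand-in for the compiler's output (one type name per allocation); the type names are borrowed from the list below and the counts are invented for illustration.

```shell
# Hypothetical stand-in for the compiler's output: one type name per allocation.
sample_log() {
    printf '%s\n' \
        dmd.mtype.TypeIdentifier \
        dmd.expression.IntegerExp \
        dmd.mtype.TypeIdentifier \
        dmd.dtemplate.TemplateInstance \
        dmd.mtype.TypeIdentifier \
        dmd.dtemplate.TemplateInstance
}

# Count each distinct line in an awk hash table, then sort only the summary.
counts=$(sample_log \
    | awk '{ a[$0]++ } END { for (i in a) print a[i], i }' \
    | sort -nr)
printf '%s\n' "$counts"
```

Only the per-type counters are held in memory, so the final `sort -nr` sees one line per distinct type rather than one line per allocation.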

Looking at the top offenders:

42634177 dmd.mtype.TypeIdentifier
20452075 dmd.dtemplate.TemplateInstance
20202329 dmd.dsymbol.DsymbolTable
18783004 dmd.declaration.AliasDeclaration
18224199 dmd.dsymbol.ScopeDsymbol
18172133 dmd.mtype.Parameter
14124126 dmd.expression.IntegerExp



Work on TypeIdentifier is likely to greatly reduce the number of allocation
calls. Here are a few thoughts on possible improvements:

* Any work that reduces the number of TypeIdentifier, TemplateInstance, 
etc. objects in the first place would help quite a bit.

* An object pooling approach may be helpful: have a pool broker all 
allocations of TypeIdentifier objects. If/when a TypeIdentifier object 
is no longer used, return it to the pool instead of deallocating. Next 
allocation request will retrieve it from the pool. That way the code is 
still safe even if by mistake the freed object continues to be used. 
(Bugs can be diagnosed by disabling pool allocation and testing again.)

* Interning: if many TypeIdentifier objects have the same content, it 
may be worthwhile tracking that and have the same reference shared from 
many places. Things like immutable and const can be of great help here.

* Layout: any improvement in the layout of TypeIdentifier (e.g. make it
smaller and a multiple of cache line size) is likely to have a large
impact on fragmentation.

I'll look into also adding information on bytes allocated tomorrow.
Jun 29
rikki cattermole <rikki cattermole.co.nz> writes:
On 30/06/2020 1:18 PM, Andrei Alexandrescu wrote:
> * Any work that reduces the number of TypeIdentifier, TemplateInstance,
> etc. objects in the first place would help quite a bit.

One strategy that is used with identifiers in dmd is to look them up in a
table keyed by their strings. Making them unique makes them faster overall.
If we could strip the SLOC out of the TypeIdentifier class, this strategy
could be used for it.
Jun 29
Stefan Koch <uplink.coder googlemail.com> writes:
On Tuesday, 30 June 2020 at 01:54:54 UTC, rikki cattermole wrote:
> On 30/06/2020 1:18 PM, Andrei Alexandrescu wrote:
>> * Any work that reduces the number of TypeIdentifier,
>> TemplateInstance, etc. objects in the first place would help
>> quite a bit.
> One strategy that is used with identifiers in dmd is to use a table
> lookup of strings to them. By making them unique, it makes them faster
> overall. If we could strip out the SLOC out of the TypeIdentifier
> class, this strategy could be used for it.

I would guess the excessive number of TypeIdentifiers is one of the side
effects of massive template instantiation. Templates are a crutch when used
for metaprogramming. It's time we start walking on two strong legs.
Jun 29
Simen Kjærås <simen.kjaras gmail.com> writes:
On Tuesday, 30 June 2020 at 01:58:42 UTC, Stefan Koch wrote:
> On Tuesday, 30 June 2020 at 01:54:54 UTC, rikki cattermole
> wrote:
>> On 30/06/2020 1:18 PM, Andrei Alexandrescu wrote:
>>> * Any work that reduces the number of TypeIdentifier,
>>> TemplateInstance, etc. objects in the first place would help
>>> quite a bit.
>> One strategy that is used with identifiers in dmd is to use a table
>> lookup of strings to them. By making them unique, it makes them faster
>> overall. If we could strip out the SLOC out of the TypeIdentifier
>> class, this strategy could be used for it.
> I would guess TypeIdentifiers being excessively is one of the side
> effects of massive template instantiation. Templates are a crutch when
> used for meta programming. It's time we start walking on two strong
> legs.

Looking at the numbers, it's interesting how closely the counts for
TemplateInstance, AliasDeclaration and ScopeDsymbol match, with
TypeIdentifier being very close to 2x that. Could it be there's some
template(s) of the form `template Foo(T...) { alias Foo = Foo!(...) }`? :p

--
  Simen
Jun 30
NilsLankila <NilsLankila gmx.us> writes:
On Wednesday, 1 July 2020 at 06:16:42 UTC, Simen Kjærås wrote:
> Could it be there's some template(s) on the form template
> Foo(T...) { alias Foo = Foo!(...) }? :p
>
> --
>   Simen

There's an alias declaration for each template parameter. It is created so
that one can refer to the template parameter in the scope corresponding to
the template body. There's already a micro-optimization for
AliasDeclaration, see https://github.com/dlang/dmd/pull/11354. It should
cut memory use by 25% (for AliasDeclaration *only*), minus the problem of
allocation blocks, so maybe more like 10% to 20% in practice.
Jul 01
kinke <noone nowhere.com> writes:
On Wednesday, 1 July 2020 at 08:15:12 UTC, NilsLankila wrote:
> There's already a micro optim for AliasDeclaration, see
> https://github.com/dlang/dmd/pull/11354. It should cut the use
> by 25% (so the number for AliasDecl *only*), minus the problem
> of allocations blocks, so maybe more 10% to 20% IRL.

For 64-bit, the size of a Declaration instance is 200 bytes (2.092
frontend) and that of an AliasDeclaration is 232 bytes; with your change it
goes down to 224 bytes, so the correct number is more like -3.5%. Thx for
doing it anyway.

Another thing to keep in mind is that all new'd allocations are padded to a
multiple of 16 bytes, see
https://github.com/dlang/dmd/blob/d97a908d35c8e6c22571688f12138862ef089337/src/dmd/root/rmem.d#L176
(for the GC/-lowmem too, which has a bigger overhead).
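
A small shell sketch of that arithmetic, taking the sizes quoted in this post at face value and assuming the round-up-to-16 rule applies to these objects exactly as described. `pad16` is a hypothetical helper written for this illustration, not a function from rmem.d.

```shell
# Round a size up to the next multiple of 16 (the assumed padding rule).
pad16() {
    echo $(( ($1 + 15) / 16 * 16 ))
}

raw_saved=$(( 232 - 224 ))      # AliasDeclaration size before and after the change
before=$(pad16 232)
after=$(pad16 224)
padded_saved=$(( before - after ))

echo "raw saving:    $raw_saved bytes per AliasDeclaration"
echo "padded saving: $padded_saved bytes per allocation ($before -> $after)"
```

Under that assumption the 8-byte trim happens to cross a 16-byte boundary, so each allocation would occupy 224 bytes instead of 240.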
Jul 01
NilsLankila <NilsLankila gmx.us> writes:
On Wednesday, 1 July 2020 at 08:38:13 UTC, kinke wrote:
> On Wednesday, 1 July 2020 at 08:15:12 UTC, NilsLankila wrote:
>> There's already a micro optim for AliasDeclaration, see
>> https://github.com/dlang/dmd/pull/11354. It should cut the use
>> by 25% (so the number for AliasDecl *only*), minus the problem
>> of allocations blocks, so maybe more 10% to 20% IRL.
> For 64-bit, the size of a Declaration instance is 200 bytes (2.092
> frontend), for AliasDeclaration 232 bytes, and with your change, down
> to 224 bytes, so the correct number is more like -3.5%. Thx for doing
> it anyway. Another thing to keep in mind is that all new'd allocations
> are padded to a multiple of 16-bytes, see
> https://github.com/dlang/dmd/blob/d97a908d35c8e6c22571688f12138862ef089337/src/dmd/root/rmem.d#L176
> (for the GC/-lowmem too, which has a bigger overhead).

Yeah, you're right. I realized later that I forgot the size of the
inherited members.
Jul 09
NilsLankila <NilsLankila gmx.us> writes:
On Tuesday, 30 June 2020 at 01:54:54 UTC, rikki cattermole wrote:
> On 30/06/2020 1:18 PM, Andrei Alexandrescu wrote:
>> * Any work that reduces the number of TypeIdentifier,
>> TemplateInstance, etc. objects in the first place would help
>> quite a bit.
> One strategy that is used with identifiers in dmd is to use a table
> lookup of strings to them. By making them unique, it makes them faster
> overall. If we could strip out the SLOC out of the TypeIdentifier
> class, this strategy could be used for it.

idPool is already used, but only to guarantee identifier uniqueness (i.e.
reject dups). I think that a second idPool could be used, this one to share
the identifiers that do not have to be unique, just like dparse's
StringCache/internString().
Jul 01
NilsLankila <NilsLankila gmx.us> writes:
On Wednesday, 1 July 2020 at 08:26:05 UTC, NilsLankila wrote:
> On Tuesday, 30 June 2020 at 01:54:54 UTC, rikki cattermole [...]
> idPool is already used but only to guarantee identifiers
> uniqueness (i.e reject dups). I think that a second idpool
> could be used, this one to share those who has not be be
> unique, just like dparse's StringCache/internString()

I ~~spent~~ wasted some time yesterday evening putting an interning system
directly into the two Identifier constructors (`__ctor`) that allocate, and
it is not worth it: only 300 KB were saved, and the overhead of the new
table wastes more than that. Actually, most of the identifiers are already
interned.

Beyond that, there is no more easy gain to be had in the AST. Basically,
for Expression- and Statement-derived nodes there is nothing left. Walter
has made a good optimisation for Type (-40 MB when compiling dparse). I've
made three minor ones (-3 MB for dparse). Andrei made one for template
instances, but unfortunately it hasn't saved a full block of memory.

What you have to understand, if you wish to propose something, is that to
reduce usage a class must shrink enough that its size divided by 16,
rounded up, drops by at least one; e.g. going from 32 to 17 bytes has no
impact, but going from 32 to 15 does.

Honestly, I think it will be hard to do better **without** serious work and
deep changes. Anyway, for now compiling dparse using ~master on x86_64
takes 904 MB vs. 950 MB previously. It would be nice if three or four more
micro-optimizations were found, so that it goes below 900 MB, symbolically.
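
The threshold effect described here can be checked with a few lines of shell. This is only a sketch under the assumption that every allocation is rounded up to a multiple of 16 bytes; `padded` is a hypothetical helper and the sizes are the ones from the example in this post.

```shell
# Size actually consumed after rounding up to a multiple of 16 (assumed rule).
padded() {
    echo $(( ($1 + 15) / 16 * 16 ))
}

# A 32-byte class and two hypothetical slimmed-down versions of it.
table=$(for size in 32 17 15; do
    echo "$size -> $(padded "$size")"
done)
printf '%s\n' "$table"
```

Shrinking from 32 to 17 bytes still lands in the 32-byte slot; only the 15-byte version drops to the 16-byte slot.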
Jul 09
Bruce Carneal <bcarneal gmail.com> writes:
On Tuesday, 30 June 2020 at 01:18:48 UTC, Andrei Alexandrescu
wrote:
> I got a few numbers on what types dmd allocates the most while
> compiling a large project.

[snip of interesting description of how the numbers were obtained]

> Looking at the top offenders:
>
> 42634177 dmd.mtype.TypeIdentifier
> 20452075 dmd.dtemplate.TemplateInstance
> 20202329 dmd.dsymbol.DsymbolTable
> 18783004 dmd.declaration.AliasDeclaration
> 18224199 dmd.dsymbol.ScopeDsymbol
> 18172133 dmd.mtype.Parameter
> 14124126 dmd.expression.IntegerExp
>
> [...] TypeIdentifier is likely to greatly relieve the number of
> allocation calls. Here are a few thoughts on possible
> improvements:

[snip of improvement descriptions]

Do you know, or can you easily find out, how many of these type identifiers
might be eliminated by the improvements Stefan Koch and others have talked
about recently?

I'm not an expert, but from those discussions it sounded like a great deal
of type identifier generation activity was a by-product of the compiler
implementation: suffix-appending recursions where tighter iterative forms
might suffice.
Jun 29
Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:
On 6/29/20 9:56 PM, Bruce Carneal wrote:
> On Tuesday, 30 June 2020 at 01:18:48 UTC, Andrei Alexandrescu wrote:
>> I got a few numbers on what types dmd allocates the most while
>> compiling a large project.
> [snip of interesting description of how the numbers were obtained]
>> Looking at the top offenders:
>>
>> 42634177 dmd.mtype.TypeIdentifier
>> 20452075 dmd.dtemplate.TemplateInstance
>> 20202329 dmd.dsymbol.DsymbolTable
>> 18783004 dmd.declaration.AliasDeclaration
>> 18224199 dmd.dsymbol.ScopeDsymbol
>> 18172133 dmd.mtype.Parameter
>> 14124126 dmd.expression.IntegerExp
>>
>> [...] likely to greatly relieve the number of allocation calls. Here
>> are a few thoughts on possible improvements:
> [snip of improvement descriptions]
> Do you know, or can you easily find out how many of these type
> identifiers might be eliminated by the improvements Stefan Koch and
> others have talked about recently?

I don't know, maybe Stefan would. Anyway, I attach one more file in the
format:

total_bytes total_objects object_size type_name

The top offenders are about the same, but now the top 2 are much closer to
each other.
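
A report in that four-column format can be produced with the same awk approach as before. The sketch below assumes the logging line is changed to emit the object size next to the type name (that change is not shown in this thread), so both the input layout and the `example.*` types with their sizes are hypothetical placeholders.

```shell
# Hypothetical per-allocation log: "<object_size> <type_name>" on each line.
# Type names and sizes are placeholders, not measured values.
sample_sizes() {
    printf '%s\n' \
        '64 example.TypeA' \
        '128 example.TypeB' \
        '64 example.TypeA' \
        '64 example.TypeA' \
        '128 example.TypeB' \
        '48 example.TypeC'
}

# Count objects per type, remember the size, and print
# "total_bytes total_objects object_size type_name", largest total first.
report=$(sample_sizes | awk '
    { count[$2]++; size[$2] = $1 }
    END { for (t in count) print count[t] * size[t], count[t], size[t], t }
' | sort -nr)
printf '%s\n' "$report"
```

In this toy input TypeA has the most allocations but TypeB the most bytes, which is the kind of reordering a bytes-based ranking can reveal.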
Jun 30
Avrina <avrina12309412342 gmail.com> writes:
On Tuesday, 30 June 2020 at 18:18:32 UTC, Andrei Alexandrescu
wrote:
> On 6/29/20 9:56 PM, Bruce Carneal wrote:
>> On Tuesday, 30 June 2020 at 01:18:48 UTC, Andrei Alexandrescu
>> wrote:
>>> I got a few numbers on what types dmd allocates the most
>>> while compiling a large project.
>> [snip of interesting description of how the numbers were obtained]
>>> Looking at the top offenders:
>>>
>>> 42634177 dmd.mtype.TypeIdentifier
>>> 20452075 dmd.dtemplate.TemplateInstance
>>> 20202329 dmd.dsymbol.DsymbolTable
>>> 18783004 dmd.declaration.AliasDeclaration
>>> 18224199 dmd.dsymbol.ScopeDsymbol
>>> 18172133 dmd.mtype.Parameter
>>> 14124126 dmd.expression.IntegerExp
>>>
>>> [...] on TypeIdentifier is likely to greatly relieve the number of
>>> allocation calls. Here are a few thoughts on possible
>>> improvements:
>> [snip of improvement descriptions]
>> Do you know, or can you easily find out how many of these type
>> identifiers might be eliminated by the improvements Stefan Koch and
>> others have talked about recently?
> I don't know, maybe Stefan would. Anyway, I attach one more file in the
> format: total_bytes total_objects object_size type_name The top
> offenders are about the same, but now the top 2 are much closer to each
> other.

Can you post the project? If not, is there a similar one? I don't know of
many big open source projects, especially ones that eat over 16 GB of
memory.
Jul 09
Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 9 July 2020 at 12:29:57 UTC, Avrina wrote:
> On Tuesday, 30 June 2020 at 18:18:32 UTC, Andrei Alexandrescu
> wrote:
>> On 6/29/20 9:56 PM, Bruce Carneal wrote:
>>> [...]
>> I don't know, maybe Stefan would. Anyway, I attach one more file in
>> the format: total_bytes total_objects object_size type_name The top
>> offenders are about the same, but now the top 2 are much closer to
>> each other.
> Can you post the project? If not, is there a similar one? I don't know
> of many big open source projects, especially ones that eat over 16 GB
> of memory.

Check out libdparse, anything which uses std.regex, or the std.range
unittests.
Jul 09
Kagamin <spam here.lot> writes:
On Tuesday, 30 June 2020 at 01:18:48 UTC, Andrei Alexandrescu
wrote:
> think that'd be not atypical for a D program because people who
> wouldn't need that kind of stuff wouldn't derive much advantage
> from using D in the first place.

It doesn't work like that; see
https://www.teamten.com/lawrence/writings/java-for-everything.html
And it doesn't matter how big the advantage is: any advantage is more than
enough, and I think D's advantage is considerable.
Jul 01
NilsLankila <NilsLankila gmx.us> writes:
On Tuesday, 30 June 2020 at 01:18:48 UTC, Andrei Alexandrescu
wrote:
> I got a few numbers on what types dmd allocates the most while
> compiling a large project.
> [...]
> 14124126 dmd.expression.IntegerExp

IntegerExp instances are already optimized using `.literal()` and
`.createBool()`, so unfortunately this one is not really actionable,
although it would benefit from a dedicated allocator / memory pool.
Jul 01