www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - A program leaking memory.

reply =?ISO-8859-1?Q?Johan_Gr=F6nqvist?= <johan.gronqvist gmail.com> writes:
I have now transcribed a program from c++ to D, but when I run it, it
uses all memory and crashes.

I try to minimise a function of many variables iteratively using a
steepest descent method. The iteration steps are performed in a function
named step.

Step uses two vectors, and these are allocated anew in each step. After
the step-function has finished, I expected these vectors to be
garbage-collected and disappear.

I allocate vectors with:
  real[] grad;
  grad.lenth=1000;
  for (uint ix = 0; ix<grad.lenth; ix++) grad[ix] = gradient(ix);

I the use the vectors but do nothing to deallocate them or indicate that
I am done using them.

Memory usage grows until the program need too much memory. My
interpretation is that new vectors are allocated at each iteration, but
the old vectors are not freed.

The behaviour is the same using DMD and GDC.

Thanks in advance

/ johan
Mar 13 2006
parent reply Thomas Kuehne <thomas-dloop kuehne.cn> writes:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Johan Grönqvist schrieb am 2006-03-13:
 I have now transcribed a program from c++ to D, but when I run it, it
 uses all memory and crashes.

 I try to minimise a function of many variables iteratively using a
 steepest descent method. The iteration steps are performed in a function
 named step.

 Step uses two vectors, and these are allocated anew in each step. After
 the step-function has finished, I expected these vectors to be
 garbage-collected and disappear.

 I allocate vectors with:
   real[] grad;
   grad.lenth=1000;
   for (uint ix = 0; ix<grad.lenth; ix++) grad[ix] = gradient(ix);

 I the use the vectors but do nothing to deallocate them or indicate that
 I am done using them.

 Memory usage grows until the program need too much memory. My
 interpretation is that new vectors are allocated at each iteration, but
 the old vectors are not freed.

 The behaviour is the same using DMD and GDC.

Does grad.length increase with each step call? (potential fragmentation) Is step an anonymous function or is it nested in an other function? Can you replace real[] grad; grad.length = 1000; with real[1000] grad; ? (depends on you problem and your design) Neither the internal GC stats nor external tools did show any GC misbehaviour: # import std.random; # import std.gc; # import std.stdio; # # real gradient(size_t x){ # return x * rand(); # } # # void step(){ # real[] grad; # grad.length = 1000; # # for(size_t ix = 0; ix < grad.length; ix++){ # grad[ix] = gradient(ix); # } # } # # int main(){ # size_t i = 0; # GCStats gc; # while(i++ < i.max){ # step(); # getStats(gc); # writefln("[%s] pool:%s used:%s free:%s FREE:%s PAGE:%s", # i, gc.poolsize, gc.usedsize, gc.freelistsize, # gc.freeblocks, gc.pageblocks); # } # return 0; # } Thomas -----BEGIN PGP SIGNATURE----- iD8DBQFEFbLp3w+/yD4P9tIRAoQEAJ93ePekWy3TTnPNBSVt48RtPqhRCwCfb+rF oO+/3cS25HMU8AYkwNFyfQ8= =5q+D -----END PGP SIGNATURE-----
Mar 13 2006
next sibling parent Dejan Lekic <dejan nu6.org> writes:
Mr. Kuehne, your code does not work on my Linux box with DMD 0.149.
Here is the output of compiler:
/usr/src/phobos/std/gc.d(43): identifier 'GCStats' is not defined
/usr/src/phobos/std/gc.d(43): GCStats is used as a type
/usr/src/phobos/std/gc.d(43): cannot have parameter of type void

Regards

Dejan
Mar 13 2006
prev sibling parent reply =?UTF-8?B?Sm9oYW4gR3LDtm5xdmlzdA==?= <johan.gronqvist gmail.com> writes:
Thomas: Thak you very much for your answer!

Now I have found the cause to this problem,

The garbage collector collects so rarely that my system starts swapping
intensely.

This time I only used gdc.
(gcc version 4.0.1, gdc 0.17, using dmd 0.140)

Any hints to the solution are most welcome!

Is there a way to tell the gc never to use more than (say) 100MB of memory?
I could insert an if-clause in each step, calling the collector if some
limit is exceeded, but if feels a bit inelegant.


Comments to Thomas Kuehnes very helpful answer below.

/ johan




1) Static vs. Dynamic arrays:
--------------------------------------------------------
The system size I use will not be known at compile time, and my
understanding is that I can therefore not use

real[myLength] grad;

but have to use

real[] grad;
grad.length = myLength;

where myLength is computed earlier.

The arrays I initialize in this way are never again resized. With this I
mean that the two first things I do are

real[] grad;
grad.lenth = myLength;

thereafter I modify the elements and read off their values a lot, but I
never resize the array again, i.e., the memory need for grad should
never change during a step, and as soon as one step ends and the next
starts the old grad-vector is no loner needed.
--------------------------------------------------------


2) Garbage collection
------------------------------------------
My main loop calls step each time, and I pasted your code checking the
state of the gc into my loop.

First some (selected) lines:
[3] pool:1048576 used:5072 free:19504 FREE:28 PAGE:3
[24] pool:3997696 used:10224 free:18448 FREE:81 PAGE:12
[38] pool:3997696 used:5200 free:19376 FREE:230 PAGE:10
[66] pool:16121856 used:25280 free:15680 FREE:300 PAGE:49
[77] pool:22675456 used:5200 free:19376 FREE:276 PAGE:71

So far everything is fine. Memory usage seems to fluctuate, steadily
upwards with each iteration, and dropping to ~5200 with irregular intervals.

From here on, however, it never reaches down to the same levels again,
and (restricting to 200 iterations) the end of the output is

[199] pool:99352576 used:35744 free:29792 FREE:930 PAGE:315
[200] pool:99680256 used:38272 free:27264 FREE:862 PAGE:317

I would like it to do gc before my system starts swapping.
My interpretation of the above is that it does gc to rarely.


After I insert a fullCollect() every iteration, my output ends with

[193] pool:4653056 used:2736 free:17744 FREE:95 PAGE:14
[194] pool:5308416 used:2704 free:17776 FREE:107 PAGE:16
[195] pool:5963776 used:2736 free:17744 FREE:119 PAGE:18
[196] pool:6619136 used:2720 free:17760 FREE:131 PAGE:20
[197] pool:7274496 used:2736 free:17744 FREE:143 PAGE:22
[198] pool:7929856 used:2736 free:17744 FREE:155 PAGE:24
[199] pool:8585216 used:2736 free:17744 FREE:167 PAGE:26
[200] pool:9240576 used:2736 free:17744 FREE:179 PAGE:28
and I can see that this is much better.

My next guess would be to include calls to the generational garbage
collector at each iteration (or when memory exceeds a certain amount),
but I do not know would be the best solution.




Thomas Kuehne wrote:
 Johan Grýnqvist schrieb am 2006-03-13:
I have now transcribed a program from c++ to D, but when I run it, it
uses all memory and crashes.

I try to minimise a function of many variables iteratively using a
steepest descent method. The iteration steps are performed in a function
named step.

Step uses two vectors, and these are allocated anew in each step. After
the step-function has finished, I expected these vectors to be
garbage-collected and disappear.

I allocate vectors with:
  real[] grad;
  grad.lenth=1000;
  for (uint ix = 0; ix<grad.lenth; ix++) grad[ix] = gradient(ix);

I the use the vectors but do nothing to deallocate them or indicate that
I am done using them.

Memory usage grows until the program need too much memory. My
interpretation is that new vectors are allocated at each iteration, but
the old vectors are not freed.

The behaviour is the same using DMD and GDC.


Does grad.length increase with each step call? (potential fragmentation) Is step an anonymous function or is it nested in an other function? Can you replace real[] grad; grad.length = 1000; with real[1000] grad; ? (depends on you problem and your design) Neither the internal GC stats nor external tools did show any GC misbehaviour: # import std.random; # import std.gc; # import std.stdio; # # real gradient(size_t x){ # return x * rand(); # } # # void step(){ # real[] grad; # grad.length = 1000; # # for(size_t ix = 0; ix < grad.length; ix++){ # grad[ix] = gradient(ix); # } # } # # int main(){ # size_t i = 0; # GCStats gc; # while(i++ < i.max){ # step(); # getStats(gc); # writefln("[%s] pool:%s used:%s free:%s FREE:%s PAGE:%s", # i, gc.poolsize, gc.usedsize, gc.freelistsize, # gc.freeblocks, gc.pageblocks); # } # return 0; # } Thomas

Mar 13 2006
next sibling parent Hasan Aljudy <hasan.aljudy gmail.com> writes:
Johan Grönqvist wrote:
 Thomas: Thak you very much for your answer!
 
 Now I have found the cause to this problem,
 
 The garbage collector collects so rarely that my system starts swapping
 intensely.
 
 This time I only used gdc.
 (gcc version 4.0.1, gdc 0.17, using dmd 0.140)
 
 Any hints to the solution are most welcome!
 
 Is there a way to tell the gc never to use more than (say) 100MB of memory?
 I could insert an if-clause in each step, calling the collector if some
 limit is exceeded, but if feels a bit inelegant.
 
 
 Comments to Thomas Kuehnes very helpful answer below.
 
 / johan
 

maybe you can call gc.fullCollect() every once in a while http://www.digitalmars.com/d/phobos/std_gc.html
Mar 13 2006
prev sibling next sibling parent Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:
Johan Grönqvist wrote:
 1) Static vs. Dynamic arrays:
 --------------------------------------------------------
 The system size I use will not be known at compile time, and my
 understanding is that I can therefore not use
 
 real[myLength] grad;
 
 but have to use
 
 real[] grad;
 grad.length = myLength;
 
 where myLength is computed earlier.
 
 The arrays I initialize in this way are never again resized. With this I
 mean that the two first things I do are
 
 real[] grad;
 grad.lenth = myLength;
 
 thereafter I modify the elements and read off their values a lot, but I
 never resize the array again, i.e., the memory need for grad should
 never change during a step, and as soon as one step ends and the next
 starts the old grad-vector is no loner needed.

Since you never resize the array, it should be perfectly feasible to use `real[myLength] grad;` instead. You might also try using 'foreach' but that doesn't solve your problem, just makes it more readable (IMHO). One thing you might try is: # real[] grad = new real[myLength]; # // ... loop ... # delete grad; I'm pretty sure you can use 'delete' with anything 'new'd, as in C++. -- Chris Nicholson-Sauls
Mar 13 2006
prev sibling next sibling parent reply Thomas Kuehne <thomas-dloop kuehne.cn> writes:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Johan Grönqvist schrieb am 2006-03-13:

[snip]

 2) Garbage collection
 ------------------------------------------
 My main loop calls step each time, and I pasted your code checking the
 state of the gc into my loop.

 First some (selected) lines:
 [3] pool:1048576 used:5072 free:19504 FREE:28 PAGE:3
 [24] pool:3997696 used:10224 free:18448 FREE:81 PAGE:12
 [38] pool:3997696 used:5200 free:19376 FREE:230 PAGE:10
 [66] pool:16121856 used:25280 free:15680 FREE:300 PAGE:49
 [77] pool:22675456 used:5200 free:19376 FREE:276 PAGE:71

 So far everything is fine. Memory usage seems to fluctuate, steadily
 upwards with each iteration, and dropping to ~5200 with irregular intervals.

 From here on, however, it never reaches down to the same levels again,
 and (restricting to 200 iterations) the end of the output is

 [199] pool:99352576 used:35744 free:29792 FREE:930 PAGE:315
 [200] pool:99680256 used:38272 free:27264 FREE:862 PAGE:317

 I would like it to do gc before my system starts swapping.
 My interpretation of the above is that it does gc to rarely.

This isn't the only possible cause. What happens if you call fullCollect() after iteration 199? (no fullCollect() calls from your code between iteration 1 and 199)
 After I insert a fullCollect() every iteration, my output ends with

 [193] pool:4653056 used:2736 free:17744 FREE:95 PAGE:14
 [194] pool:5308416 used:2704 free:17776 FREE:107 PAGE:16
 [195] pool:5963776 used:2736 free:17744 FREE:119 PAGE:18
 [196] pool:6619136 used:2720 free:17760 FREE:131 PAGE:20
 [197] pool:7274496 used:2736 free:17744 FREE:143 PAGE:22
 [198] pool:7929856 used:2736 free:17744 FREE:155 PAGE:24
 [199] pool:8585216 used:2736 free:17744 FREE:167 PAGE:26
 [200] pool:9240576 used:2736 free:17744 FREE:179 PAGE:28
 and I can see that this is much better.

 My next guess would be to include calls to the generational garbage
 collector at each iteration (or when memory exceeds a certain amount),
 but I do not know would be the best solution.

Let's try to make sure that the array is the cause. 1) add the imports # import std.c.stdlib; # import std.string; # import std.gc; 2) replace # real[] grad; # grad.length = 1000; with # real[] grad; # grad = (cast(real*) std.c.stdlib.calloc(1000, typeof(grad[0]).sizeof))[0 .. 1000]; # addRoot(grad.ptr); 3) add the following to the end of the function: # memset(grad.ptr, 0, grad.length * typeof(grad[0]).sizeof); # removeRoot(grad.ptr); # std.c.stdlib.free(grad.ptr); Even better would be if you could use single allocation of the array. requires: * the size doesn't change between repeated calles to "step" * only on thread at a time executes "step" * step isn't recursive If too few collection calls are part of the problem, then phobos/internal/gc/gcx.d:mallocNoSync might be a starting point. Thomas -----BEGIN PGP SIGNATURE----- iD8DBQFEFsSa3w+/yD4P9tIRAnoDAKCiOFgfji5xk0WxWJd1MkymFDh/bQCfZUzF 0/5IZCR6dL/f4uKHlw/7gug= =L+71 -----END PGP SIGNATURE-----
Mar 14 2006
parent reply =?UTF-8?B?Sm9oYW4gR3LDtm5xdmlzdA==?= <johan.gronqvist gmail.com> writes:
Thomas Kuehne wrote:
 Johan Grýnqvist schrieb am 2006-03-13:
 
 [snip]
 

Before I start, my main loop now looks like int iter = 0; while (iter < 500) { newEnergy = canti.step(lastEnergy); lastEnergy = newEnergy; iter++; fullCollect(); getStats(gc); }; where I have removed lines only doing output to make it clearer. As canti.step does not have access to the variable iter, I am confident that all iterations should behave the same way as far as memory management goes.
[66] pool:16121856 used:25280 free:15680 FREE:300 PAGE:49
[77] pool:22675456 used:5200 free:19376 FREE:276 PAGE:71

So far everything is fine. Memory usage seems to fluctuate, steadily
upwards with each iteration, and dropping to ~5200 with irregular intervals.

From here on, however, it never reaches down to the same levels again,
and (restricting to 200 iterations) the end of the output is

[199] pool:99352576 used:35744 free:29792 FREE:930 PAGE:315
[200] pool:99680256 used:38272 free:27264 FREE:862 PAGE:317

I would like it to do gc before my system starts swapping.
My interpretation of the above is that it does gc to rarely.


This isn't the only possible cause. What happens if you call fullCollect() after iteration 199? (no fullCollect() calls from your code between iteration 1 and 199)

Ok: fullCollect after 200 and 400 iterations, running for a total of 500 iterations. [197] pool:97124352 used:21520 free:44016 FREE:682 PAGE:311 [198] pool:97779712 used:24048 free:41488 FREE:694 PAGE:313 [199] pool:98435072 used:26576 free:38960 FREE:706 PAGE:315 Collecting garbage [200] pool:99090432 used:13984 free:47456 FREE:719 PAGE:317 [201] pool:99418112 used:16496 free:44944 FREE:651 PAGE:319 [202] pool:100073472 used:19024 free:42416 FREE:663 PAGE:321 Hmmm... It no longer drops to close to 5000, and the pool is HUGE. I assume tis is what my OS sees the program using, is that correct? [260] pool:135462912 used:48416 free:21216 FREE:717 PAGE:437 [261] pool:136118272 used:50944 free:18688 FREE:729 PAGE:439 At this point (261 iterations) I get from top that PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 3611 johan 24 0 146m 131m 596 S 0.0 27.0 0:26.96 a.out 7707 root 15 0 216m 81m 3076 S 2.1 16.7 16:16.29 Xorg indicating that "pool" is what determines if my system swaps or not. The pool continues to increase. [397] pool:218365952 used:51552 free:42656 FREE:675 PAGE:711 [398] pool:219021312 used:54080 free:40128 FREE:687 PAGE:713 [399] pool:219676672 used:56608 free:37600 FREE:699 PAGE:715 Collecting garbage [400] pool:220332032 used:50048 free:44160 FREE:711 PAGE:717 [401] pool:220987392 used:52576 free:41632 FREE:723 PAGE:719 [402] pool:221642752 used:55104 free:39104 FREE:735 PAGE:721 This collect after 400 iteration does almost nothing!! And it does not release any memory to the system! PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 3611 johan 18 0 228m 218m 596 S 0.0 44.7 0:49.88 a.out 3097 johan 16 0 87904 38m 13m S 0.0 8.0 0:16.55 kpdf Going on to the end... [498] pool:279642112 used:96896 free:25984 FREE:680 PAGE:913 [499] pool:280297472 used:99424 free:23456 FREE:692 PAGE:915 [500] pool:280952832 used:101952 free:20928 FREE:704 PAGE:917
After I insert a fullCollect() every iteration, my output ends with



[11] pool:2031616 used:2672 free:17808 FREE:343 PAGE:2 [12] pool:2031616 used:2688 free:17792 FREE:343 PAGE:2 [13] pool:2031616 used:2688 free:17792 FREE:343 PAGE:2 [341] pool:3997696 used:2736 free:17744 FREE:823 PAGE:2 [342] pool:3997696 used:2736 free:17744 FREE:823 PAGE:2 [343] pool:3997696 used:2736 free:17744 FREE:823 PAGE:2 So far good, but something happens: [350] pool:3997696 used:2736 free:17744 FREE:379 PAGE:8 [351] pool:3997696 used:2736 free:17744 FREE:231 PAGE:10 [352] pool:3997696 used:2736 free:17744 FREE:83 PAGE:12 [353] pool:4653056 used:3760 free:20816 FREE:94 PAGE:14 [354] pool:5308416 used:3760 free:20816 FREE:106 PAGE:16 [355] pool:5963776 used:3760 free:20816 FREE:118 PAGE:18 [356] pool:6619136 used:3760 free:20816 FREE:130 PAGE:20 and the end reads: [497] pool:93126656 used:6896 free:21776 FREE:381 PAGE:302 [498] pool:93782016 used:6896 free:21776 FREE:393 PAGE:304 [499] pool:94437376 used:6896 free:21776 FREE:405 PAGE:306 [500] pool:95092736 used:6896 free:21776 FREE:417 PAGE:308 Thus, from iteration 13 to 341 the pool changes by a factor of 2. From iteration 341 to 500 it changes by a factor of 30, if I read it correctly. I see no reason why.
 
 Let's try to make sure that the array is the cause.
 
 Even better would be if you could use single allocation of the array.

With one single allocation for all arrys involved EXCELLENT!! [47] pool:1048576 used:120080 free:15088 FREE:1 PAGE:3 [48] pool:1048576 used:122592 free:16672 FREE:0 PAGE:3 [49] pool:1048576 used:5232 free:35728 FREE:24 PAGE:3 [50] pool:1048576 used:7744 free:33216 FREE:24 PAGE:3 [95] pool:1048576 used:120736 free:18528 FREE:0 PAGE:3 [96] pool:1048576 used:123248 free:16016 FREE:0 PAGE:3 [97] pool:1048576 used:5232 free:23440 FREE:27 PAGE:3 [98] pool:1048576 used:7728 free:20944 FREE:27 PAGE:3 [142] pool:1048576 used:118848 free:16320 FREE:1 PAGE:3 [143] pool:1048576 used:121376 free:17888 FREE:0 PAGE:3 [144] pool:1048576 used:123904 free:15360 FREE:0 PAGE:3 [145] pool:1048576 used:5424 free:31440 FREE:25 PAGE:3 [146] pool:1048576 used:7952 free:28912 FREE:25 PAGE:3 [147] pool:1048576 used:10480 free:30480 FREE:24 PAGE:3 .... [496] pool:1048576 used:48176 free:13264 FREE:19 PAGE:3 [497] pool:1048576 used:50704 free:14832 FREE:18 PAGE:3 [498] pool:1048576 used:53232 free:12304 FREE:18 PAGE:3 [499] pool:1048576 used:55760 free:13872 FREE:17 PAGE:3 [500] pool:1048576 used:58288 free:11344 FREE:17 PAGE:3 As soon as it reaches the pool-size, everything is collected, and the used memory is small. I do not see why. I thought the GC would find all forgotten arrays that have been allocated at fullCollect time. As the problem was not in the number of calls to the garbage collector, I still do not understand this. Thanks for the help! / johan
Mar 17 2006
parent Thomas Kuehne <thomas-dloop kuehne.cn> writes:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Johan Grönqvist schrieb am 2006-03-17:
 Thomas Kuehne wrote:
 Johan Grýnqvist schrieb am 2006-03-13:


[snip]
 Ok: fullCollect after 200 and 400 iterations, running for a total of 500
 iterations.

 [197] pool:97124352 used:21520 free:44016 FREE:682 PAGE:311
 [198] pool:97779712 used:24048 free:41488 FREE:694 PAGE:313
 [199] pool:98435072 used:26576 free:38960 FREE:706 PAGE:315
 Collecting garbage

 [200] pool:99090432 used:13984 free:47456 FREE:719 PAGE:317
 [201] pool:99418112 used:16496 free:44944 FREE:651 PAGE:319
 [202] pool:100073472 used:19024 free:42416 FREE:663 PAGE:321

 Hmmm... It no longer drops to close to 5000, and the pool is HUGE. 

[snip]
 [397] pool:218365952 used:51552 free:42656 FREE:675 PAGE:711
 [398] pool:219021312 used:54080 free:40128 FREE:687 PAGE:713
 [399] pool:219676672 used:56608 free:37600 FREE:699 PAGE:715
 Collecting garbage

 [400] pool:220332032 used:50048 free:44160 FREE:711 PAGE:717
 [401] pool:220987392 used:52576 free:41632 FREE:723 PAGE:719
 [402] pool:221642752 used:55104 free:39104 FREE:735 PAGE:721

 This collect after 400 iteration does almost nothing!!
 And it does not release any memory to the system!

[snip]
 Going on to the end...

 [498] pool:279642112 used:96896 free:25984 FREE:680 PAGE:913
 [499] pool:280297472 used:99424 free:23456 FREE:692 PAGE:915
 [500] pool:280952832 used:101952 free:20928 FREE:704 PAGE:917

After I insert a fullCollect() every iteration, my output ends with



[11] pool:2031616 used:2672 free:17808 FREE:343 PAGE:2 [12] pool:2031616 used:2688 free:17792 FREE:343 PAGE:2 [13] pool:2031616 used:2688 free:17792 FREE:343 PAGE:2 [341] pool:3997696 used:2736 free:17744 FREE:823 PAGE:2 [342] pool:3997696 used:2736 free:17744 FREE:823 PAGE:2 [343] pool:3997696 used:2736 free:17744 FREE:823 PAGE:2 So far good, but something happens: [350] pool:3997696 used:2736 free:17744 FREE:379 PAGE:8 [351] pool:3997696 used:2736 free:17744 FREE:231 PAGE:10 [352] pool:3997696 used:2736 free:17744 FREE:83 PAGE:12 [353] pool:4653056 used:3760 free:20816 FREE:94 PAGE:14 [354] pool:5308416 used:3760 free:20816 FREE:106 PAGE:16 [355] pool:5963776 used:3760 free:20816 FREE:118 PAGE:18 [356] pool:6619136 used:3760 free:20816 FREE:130 PAGE:20 and the end reads: [497] pool:93126656 used:6896 free:21776 FREE:381 PAGE:302 [498] pool:93782016 used:6896 free:21776 FREE:393 PAGE:304 [499] pool:94437376 used:6896 free:21776 FREE:405 PAGE:306 [500] pool:95092736 used:6896 free:21776 FREE:417 PAGE:308 Thus, from iteration 13 to 341 the pool changes by a factor of 2. From iteration 341 to 500 it changes by a factor of 30, if I read it correctly.

The small increase of "used" and "free" doesn't seem critical. Something else seems odd: the statistic gathering of the GC iteration 11: pool / (FREE + PAGE) -> 5888 (free + used) / (FREE + PAGE) -> 59 pool / (free + used) -> 99 iteration 350: pool / (FREE + PAGE) -> 10329 (free + used) / (FREE + PAGE) -> 52 pool / (free + used) -> 195 iteration 500: pool / (FREE + PAGE) -> 13116 (free + used) / (FREE + PAGE) -> 39 pool / (free + used) -> 331 Part of the problem might be that B_PAGEPLUS isn't handled in getStats (-> internal/gc/gcx.d:665). This would only explain why the numbers above don't add up for "free", "used", "FREE" and "PAGE" - "pool" seems to be correct.
 Let's try to make sure that the array is the cause.
 
 Even better would be if you could use single allocation of the array.

With one single allocation for all arrys involved EXCELLENT!! [47] pool:1048576 used:120080 free:15088 FREE:1 PAGE:3 [48] pool:1048576 used:122592 free:16672 FREE:0 PAGE:3 [49] pool:1048576 used:5232 free:35728 FREE:24 PAGE:3 [50] pool:1048576 used:7744 free:33216 FREE:24 PAGE:3 [95] pool:1048576 used:120736 free:18528 FREE:0 PAGE:3 [96] pool:1048576 used:123248 free:16016 FREE:0 PAGE:3 [97] pool:1048576 used:5232 free:23440 FREE:27 PAGE:3 [98] pool:1048576 used:7728 free:20944 FREE:27 PAGE:3 [142] pool:1048576 used:118848 free:16320 FREE:1 PAGE:3 [143] pool:1048576 used:121376 free:17888 FREE:0 PAGE:3 [144] pool:1048576 used:123904 free:15360 FREE:0 PAGE:3 [145] pool:1048576 used:5424 free:31440 FREE:25 PAGE:3 [146] pool:1048576 used:7952 free:28912 FREE:25 PAGE:3 [147] pool:1048576 used:10480 free:30480 FREE:24 PAGE:3 .... [496] pool:1048576 used:48176 free:13264 FREE:19 PAGE:3 [497] pool:1048576 used:50704 free:14832 FREE:18 PAGE:3 [498] pool:1048576 used:53232 free:12304 FREE:18 PAGE:3 [499] pool:1048576 used:55760 free:13872 FREE:17 PAGE:3 [500] pool:1048576 used:58288 free:11344 FREE:17 PAGE:3 As soon as it reaches the pool-size, everything is collected, and the used memory is small. I do not see why. I thought the GC would find all forgotten arrays that have been allocated at fullCollect time.

pool size within the first few iterations. single allocation:
 [497] pool:1048576 used:50704 free:14832 FREE:18 PAGE:3
 [498] pool:1048576 used:53232 free:12304 FREE:18 PAGE:3
 [499] pool:1048576 used:55760 free:13872 FREE:17 PAGE:3
 [500] pool:1048576 used:58288 free:11344 FREE:17 PAGE:3

collect after every iteration:
 [497] pool:93126656 used:6896 free:21776 FREE:381 PAGE:302
 [498] pool:93782016 used:6896 free:21776 FREE:393 PAGE:304
 [499] pool:94437376 used:6896 free:21776 FREE:405 PAGE:306
 [500] pool:95092736 used:6896 free:21776 FREE:417 PAGE:308

collect before iteration 200 and 400:
 [498] pool:279642112 used:96896 free:25984 FREE:680 PAGE:913
 [499] pool:280297472 used:99424 free:23456 FREE:692 PAGE:915
 [500] pool:280952832 used:101952 free:20928 FREE:704 PAGE:917

(keep the odd statistic gathering in mind (see above)) As the length of the array doesn't change between iteration I assume that 2 arrays are newed - both containing at least 4097(PAGESIZE+1) bytes of data and in sum between 589826(PAGESIZE*145+2) and 598016(PAGESIZE*146) bytes of data. Thus in average 72 B_PAGEPLUS pages per B_PAGE page. single allocation (iteration 5000): pool / (FREE + (72 + 1) * PAGE) -> 4443 collect after every iteration(iteration 500): pool / (FREE + (72 + 1) * PAGE) -> 4152 collect before iteration 200 and 400(iteration 500): pool / (FREE + (72 + 1) * PAGE) -> 4153 It doesn't look like there are any hidden increases of the array length during repeated calls to the "step" function or within the "step" function. Seems like large allocations aren't collected properly under some circumstances: collect after every iteration:
 [341] pool:3997696 used:2736 free:17744 FREE:823 PAGE:2
 [342] pool:3997696 used:2736 free:17744 FREE:823 PAGE:2
 [343] pool:3997696 used:2736 free:17744 FREE:823 PAGE:2

 So far good, but something happens:

 [350] pool:3997696 used:2736 free:17744 FREE:379 PAGE:8
 [351] pool:3997696 used:2736 free:17744 FREE:231 PAGE:10
 [352] pool:3997696 used:2736 free:17744 FREE:83 PAGE:12
 [353] pool:4653056 used:3760 free:20816 FREE:94 PAGE:14

This is certanly the interresting point. Either some of your data begins to look like a pointer into managed memory somewhere around iteration 348 or there is some serious issue with the GC. Checking the "pointer lookalike" issue isn't trivial. As a start ensure that all your "data" numbers(array, temporaries, etc) are between 0 and 1024 while withing the iteration loop. If the gigantic pool size remains you should file a bug (http://d.puremagic.com/bugzilla) with attached minimal but runnable testcase sources or - if there are any IP issues with your code - read http://www.digitalmars.com/bugs.html and send Walter the bug. Thomas -----BEGIN PGP SIGNATURE----- iD8DBQFEHJlf3w+/yD4P9tIRAhuXAJwIZiLGomTusBJGKXhH/AtazHe5HACfZluJ nybaz2n1i1jeXWgr/cANXls= =Q/4U -----END PGP SIGNATURE-----
Mar 18 2006
prev sibling parent Nick <Nick_member pathlink.com> writes:
Johan Gronqvist spake thusly:
Is there a way to tell the gc never to use more than (say) 100MB of memory?

Nope, not at the moment, but it would be nice if there was.
The arrays I initialize in this way are never again resized. With this I
mean that the two first things I do are

real[] grad;
grad.lenth = myLength;

thereafter I modify the elements and read off their values a lot, but I
never resize the array again, i.e., the memory need for grad should
never change during a step, and as soon as one step ends and the next
starts the old grad-vector is no loner needed.

Whenever you have such a planned memory usage (ie. allocate a fixed array, use once, discard, then repeat many times), it isn't terribly efficient to rely on the GC to pick up all your discarded memory. But that's OK, D doesn't force you to. The first thing I would try is to put the array outside the step() function, like this: # read grad[]; # # void step() # { # grad.length = whatever; # ... // Use it here # } This will resize the _same_ array over and over, instead of creating a new one each time. This will often (but not always) resize in place, thus saving memory. Especially in situations like this: # ... # grad.length = 100000; // Big allocation # ... # grad.length = 10; // Resize does not move array # # grad.length = 100000; // The already array points to a block large # // enough to hold this data, so it does not move. Thus this allocates a total of 100000 reals, not 200010. Hope this helps. If not there are other more advanced techniques one can use. Nick
Mar 18 2006