digitalmars.D - either me or GC sux badly (GC don't reuse free memory)
- ketmar via Digitalmars-d (31/31) Nov 12 2014 Hello.
- thedeemon (9/14) Nov 12 2014 Sure: just make the GC precise, not conservative. ;)
- ketmar via Digitalmars-d (5/21) Nov 12 2014 even with NO_INTERIOR the sample keeps failing (yet after more
- ketmar via Digitalmars-d (21/37) Nov 12 2014 but 'mkay, let's change the sample a little:
- Matthias Bentrup (7/47) Nov 12 2014 On Linux/x86 you have only 3 GB virtual address space, and this
- ketmar via Digitalmars-d (9/61) Nov 12 2014 i know it. what i can't get is why D allocates more and more address
- Kagamin (4/11) Nov 12 2014 Maybe you fragmented the heap and don't have 1.7GB of contiguous
- ketmar via Digitalmars-d (6/18) Nov 12 2014 i gave two example programs, which demonstrates the effect, and they
- Kagamin (5/6) Nov 12 2014 Why do you think so?
- ketmar via Digitalmars-d (4/11) Nov 12 2014 i mean "from allocations in other places of the progam", not the
- ketmar via Digitalmars-d (5/21) Nov 12 2014 for information: yes, RES is jumping high and low as it should. but
- Kagamin (3/6) Nov 12 2014 Try to allocate the arrays with NO_SCAN flag.
- ketmar via Digitalmars-d (4/11) Nov 12 2014 why this must make any difference in the demonstrated cases? ubyte
- Steven Schveighoffer (3/8) Nov 12 2014 Really that shouldn't matter. The arrays should all be 0-initialized.
- ketmar via Digitalmars-d (4/11) Nov 12 2014 btw, compiler is smart enough to allocate array with NO_SCAN flag, i
- Steven Schveighoffer (16/47) Nov 12 2014 I think I might know what's going on.
- ketmar via Digitalmars-d (14/74) Nov 12 2014 On Wed, 12 Nov 2014 10:51:31 -0500
- Ola Fosheim Grøstad (5/8) Nov 12 2014 The gc uses C's calloc rather implementing memory handling itself
- ketmar via Digitalmars-d (8/18) Nov 12 2014 hm. my bad, i was checking malloc() with C program and it happens to
- ketmar via Digitalmars-d (7/17) Nov 12 2014 ah, and sorry once again, i was tired. ;-)
- Steven Schveighoffer (8/17) Nov 12 2014 I don't know the internals of C malloc. But I think it should be
- ketmar via Digitalmars-d (5/8) Nov 12 2014 On Wed, 12 Nov 2014 11:20:48 -0500
- Ola Fosheim Grøstad (3/5) Nov 12 2014 Yes, but then D must provide a malloc replacement.
- Sean Kelly (5/5) Nov 12 2014 It's been a while since I Dded this, but I think the GC will
- Sean Kelly (3/3) Nov 12 2014 Try following the big allocation with a really small allocation
- ketmar via Digitalmars-d (4/7) Nov 12 2014 but this clearly not an issue with sample which does `GC.free()`, and
- Ola Fosheim Grøstad (5/15) Nov 12 2014 You should get debuginfo by compiling the runtime with the PRINTF
- Kagamin (6/10) Nov 13 2014 GC probably allocates some small blocks for pools and other data.
Hello.

let's run this program:

    import core.sys.posix.unistd;
    import std.stdio;
    import core.memory;

    void main () {
      uint size = 1024*1024*300;  // start at 300MB
      for (;;) {
        auto buf = new ubyte[](size);
        writefln("%s", size);
        sleep(1);
        size += 1024*1024*100;    // grow the request by 100MB each pass
        buf = null;               // drop the only reference
        GC.collect();
        GC.minimize();
      }
    }

pretty innocent, right? i'm even trying to help the GC here. but...

    314572800
    419430400
    524288000
    629145600
    734003200
    core.exception.OutOfMemoryError (0)

oooops.

by the way, this is not actually "no more memory", this is "i'm out of address space" (yes, i'm on a 32-bit system, GNU/Linux).

the question is: am i doing something wrong here? how can i force the GC to stop eating my address space and reuse what it already has?

sure, i can use libc malloc(), refcounting, and so on, but the question remains: why is the GC not reusing already allocated and freed memory?
Nov 12 2014
On Wednesday, 12 November 2014 at 11:05:11 UTC, ketmar via Digitalmars-d wrote:
> the question is: am i doing something wrong here? how can i force GC to stop eating my address space and reuse what it already has?

Sure: just make the GC precise, not conservative. ;)

With the current GC implementation and an array this big, the chance that some word on the stack looks like a pointer into it and keeps it from being collected is almost 100%. Either don't store big arrays in the GC heap, or switch to 64 bits, where the problem is much milder: the address space is far larger, so false pointers are far less likely.
Nov 12 2014
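The advice above to keep big arrays out of the GC heap could look roughly like the sketch below. It is only an illustration: `withCBuffer` is a hypothetical helper name, and the buffer comes from libc `malloc` via `core.stdc.stdlib`, so the collector never sees or scans it.

```d
import core.stdc.stdlib : free, malloc;

/// Hypothetical helper: allocates `size` bytes outside the GC heap,
/// hands them to `use`, and releases them deterministically.
/// Returns false if malloc fails.
bool withCBuffer(size_t size, scope void delegate(ubyte[]) use)
{
    auto p = cast(ubyte*) malloc(size);
    if (p is null)
        return false;          // out of memory or out of address space
    scope (exit) free(p);      // released on every exit path
    use(p[0 .. size]);         // the slice is only a view and must not escape
    return true;
}

void main()
{
    size_t seen;
    immutable ok = withCBuffer(1024 * 1024, (ubyte[] buf) {
        buf[] = 0;             // malloc memory is not zero-initialized
        seen = buf.length;
    });
    assert(ok && seen == 1024 * 1024);
}
```

The trade-off is that the slice must not outlive the callback, since nothing keeps the memory alive once `free` has run.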
On Wed, 12 Nov 2014 12:05:25 +0000 thedeemon via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> With current GC implementation and array this big chances of having a word on the stack that looks like a pointer to it and prevents it from being collected are almost 100%.

even with NO_INTERIOR the sample keeps failing (though only after more iterations). no, really, is there ALWAYS a pointer exactly to the start of the allocated buffer somewhere? i feel something smelly here.
Nov 12 2014
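The post above does not show how NO_INTERIOR was applied. One way to request it, assuming the buffer is allocated through `GC.malloc` rather than `new`, is sketched below; `allocNoInterior` is a hypothetical name. With this attribute the collector is told that only pointers to the start of the block need to keep it alive.

```d
import core.memory : GC;

/// Hypothetical helper: a pointer-free byte buffer for which only
/// pointers to the block start count as references.
ubyte[] allocNoInterior(size_t size)
{
    auto p = cast(ubyte*) GC.malloc(size,
        GC.BlkAttr.NO_SCAN | GC.BlkAttr.NO_INTERIOR);
    return p[0 .. size];       // GC.malloc returns the block base
}

void main()
{
    auto buf = allocNoInterior(1024 * 1024);
    assert(buf.length == 1024 * 1024);
    assert(GC.getAttr(buf.ptr) & GC.BlkAttr.NO_SCAN);
    GC.free(buf.ptr);          // explicit release, as in the thread's samples
}
```

Unlike `new ubyte[](n)`, memory from `GC.malloc` is not zero-initialized, so a real program would clear it before use.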
On Wed, 12 Nov 2014 12:05:25 +0000 thedeemon via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> With current GC implementation and array this big chances of having a word on the stack that looks like a pointer to it and prevents it from being collected are almost 100%.

but 'mkay, let's change the sample a little:

    import core.memory;
    import std.stdio;

    void main () {
      uint size = 1024*1024*300;
      for (;;) {
        auto buf = new ubyte[](size);
        writefln("%s", size);
        size += 1024*1024*100;
        GC.free(GC.addrOf(buf.ptr));  // explicitly free the block
        buf = null;
        GC.collect();
        GC.minimize();
      }
    }

this shouldn't fail so soon, right? i'm freeing the memory, so...

it's still dying on 1,887,436,800. 1.7GB and that's all? this can't be true, i have 3GB of free RAM (with 1.2GB used) and 8GB of unused swap. and yes, it consumed all of the process address space again.
Nov 12 2014
On Wednesday, 12 November 2014 at 12:30:15 UTC, ketmar via Digitalmars-d wrote:
> it still dying on 1,887,436,800. 1.7GB and that's all? this can't be true, i have 3GB of free RAM (with 1.2GB used) and 8GB of unused swap. and yes, it consumed all of the process address space again.

On Linux/x86 you have only 3 GB of virtual address space, and that has to hold the program code and all loaded libraries too. Check /proc/<pid>/maps to see where the shared libraries are loaded, and look for the largest chunk of free space. That is the theoretical limit for a single allocation.
Nov 12 2014
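The check suggested above can be automated. The sketch below parses the text of /proc/self/maps and reports the largest unmapped gap; `largestGap` is a hypothetical helper, and the 0xC000_0000 upper bound assumes the usual 3 GB user / 1 GB kernel split on 32-bit Linux. Starting the scan at address 0 slightly overstates what is usable, because the lowest pages cannot be mapped.

```d
import std.algorithm : max;
import std.conv : to;
import std.file : readText;
import std.stdio : writefln;
import std.string : indexOf, lineSplitter;

/// Largest unmapped gap in bytes between `lo` and `hi`, given the text
/// of a Linux /proc/<pid>/maps file (entries are sorted by address).
ulong largestGap(string mapsText, ulong lo, ulong hi)
{
    ulong best = 0;
    ulong prevEnd = lo;
    foreach (line; mapsText.lineSplitter)
    {
        // Every entry starts with "start-end " in hexadecimal.
        auto dash = line.indexOf('-');
        auto space = line.indexOf(' ');
        if (dash < 0 || space < dash)
            continue;
        auto start = line[0 .. dash].to!ulong(16);
        auto end = line[dash + 1 .. space].to!ulong(16);
        if (start >= hi)
            break;
        if (start > prevEnd)
            best = max(best, start - prevEnd);
        prevEnd = max(prevEnd, end);
    }
    if (hi > prevEnd)
        best = max(best, hi - prevEnd);
    return best;
}

void main()
{
    // Assumption: 32-bit process with 3 GB of user address space.
    enum ulong userLimit = 0xC000_0000;
    auto gap = largestGap(readText("/proc/self/maps"), 0, userLimit);
    writefln("largest free gap: %s MiB", gap / (1024 * 1024));
}
```

Calling this inside the allocation loop would show whether the largest gap shrinks faster than the requested size grows.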
On Wed, 12 Nov 2014 12:42:10 +0000 Matthias Bentrup via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> On Linux/x86 you have only 3 GB virtual address space, and this has to include the program code + all loaded libraries too. Check out /proc/<pid>/maps, to see where the dlls are loaded, and look at the largest chunk of free space available. That is the theoretical limit that could be allocated.

i know it. what i can't get is why D allocates more and more address space with each 'new'. what i'm expecting is address space consumption on par with 'size', but it grows a lot faster.

seems that i should either read the GC code to see what's going on (oh, boring!) or write a memory region dumper (that's more fun). i bet that something is wrong with the GC memory manager, though i can't prove it for now.
Nov 12 2014
On Wednesday, 12 November 2014 at 12:30:15 UTC, ketmar via Digitalmars-d wrote:
> this shouldn't fail so soon, right? i'm freeing the memory, so... it still dying on 1,887,436,800. 1.7GB and that's all? this can't be true, i have 3GB of free RAM (with 1.2GB used) and 8GB of unused swap. and yes, it consumed all of the process address space again.

Maybe you fragmented the heap and don't have 1.7GB of contiguous memory?
Nov 12 2014
On Wed, 12 Nov 2014 15:24:08 +0000 Kagamin via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> Maybe you fragmented the heap and don't have 1.7GB of contiguous memory?

i gave two example programs which demonstrate the effect, and they aren't excerpts. the only other allocating call, `writef`, can be removed too, and the effect stays. so heap fragmentation from other allocations can't be the issue.
Nov 12 2014
On Wednesday, 12 November 2014 at 15:36:48 UTC, ketmar via Digitalmars-d wrote:
> so heap fragmentation from other allocations can't be the issue.

Why do you think so?

Try going in the opposite direction: start from 700MB and decrease the allocation size.
Nov 12 2014
On Wed, 12 Nov 2014 16:23:01 +0000 Kagamin via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> Why do you think so?

i mean "from allocations in other places of the program", not "previous allocations in this code". sorry.
Nov 12 2014
On Wed, 12 Nov 2014 12:05:25 +0000 thedeemon via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> With current GC implementation and array this big chances of having a word on the stack that looks like a pointer to it and prevents it from being collected are almost 100%.

for information: yes, RES is jumping high and low as it should. but VIRT grows steadily until there is no more address space available. so the problem is clearly not false pointers this time.
Nov 12 2014
On Wednesday, 12 November 2014 at 11:05:11 UTC, ketmar via Digitalmars-d wrote:
> the question is: am i doing something wrong here? how can i force GC to stop eating my address space and reuse what it already has?

Try allocating the arrays with the NO_SCAN flag.
Nov 12 2014
On Wed, 12 Nov 2014 15:19:51 +0000 Kagamin via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> Try to allocate the arrays with NO_SCAN flag.

why would this make any difference in the demonstrated cases? ubyte arrays are initialized to zeroes, so they can't contain false pointers.
Nov 12 2014
On 11/12/14 10:19 AM, Kagamin wrote:
> Try to allocate the arrays with NO_SCAN flag.

Really that shouldn't matter. The arrays should all be 0-initialized.

-Steve
Nov 12 2014
On Wed, 12 Nov 2014 15:19:51 +0000 Kagamin via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> Try to allocate the arrays with NO_SCAN flag.

btw, the compiler is smart enough to allocate the array with the NO_SCAN flag, i checked this with `GC.getAttr()`.
Nov 12 2014
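A check along the lines described above might look like the following sketch. The exact code used in the thread is not shown, so `isNoScan` is a hypothetical helper. It goes through `GC.addrOf` first because `GC.getAttr` expects the base address of a block, and the data pointer of a large array may not coincide with that base.

```d
import core.memory : GC;

/// Hypothetical helper: true if the GC block backing `arr` has NO_SCAN set.
bool isNoScan(T)(T[] arr)
{
    // Map a possibly interior pointer to the base of its GC block.
    auto base = GC.addrOf(arr.ptr);
    return base !is null && (GC.getAttr(base) & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    auto bytes = new ubyte[](1024 * 1024); // element type holds no pointers
    auto ptrs  = new void*[](1024);        // elements are pointers
    assert(isNoScan(bytes));
    assert(!isNoScan(ptrs));
}
```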
On 11/12/14 6:04 AM, ketmar via Digitalmars-d wrote:
> the question remains: why GC not reusing already allocated and freed memory?

I think I might know what's going on.

You are continually adding 100MB to the allocation size. Memory is contiguous from the OS, but can get fragmented inside the GC.

So let's say you allocate 300MB. Fine. The GC needs more space from the OS, allocates it, and assigns a pool to that 300MB. Now you add another 100MB. That can't fit into the original pool, so the GC allocates another 400MB. BUT it doesn't merge the 300MB into that (I don't think), so when you add another 100MB, it has a 300MB space and a 400MB space, neither of which can hold 500MB. And it goes on and on.

Keep in mind also that a frequent error is to set a pointer to null and expect the data to be collected. For example, buf could still be in a register.

I would be interested in how much memory the GC has vs. how much is actually used.

-Steve
Nov 12 2014
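The pool explanation above can be checked with simple arithmetic. The toy model below is not the real GC: it assumes every request gets a fresh pool of exactly its own size, that no pool is ever reused, merged, or returned to the OS, and that the whole 3 GB (3072 MiB) of a 32-bit Linux process is available. `firstFailingRequest` is a hypothetical name.

```d
import std.stdio : writefln;

/// Toy model: each request claims new address space and nothing is ever
/// given back. Returns the size in MiB of the first request that does
/// not fit under `limitMiB`. `stepMiB` is expected to be positive.
ulong firstFailingRequest(ulong startMiB, ulong stepMiB, ulong limitMiB)
{
    ulong reserved = 0;
    for (ulong size = startMiB; ; size += stepMiB)
    {
        if (reserved + size > limitMiB)
            return size;
        reserved += size;      // this address space stays claimed
    }
}

void main()
{
    // 300 + 400 + 500 + 600 + 700 = 2500 MiB fits; adding 800 does not.
    immutable failing = firstFailingRequest(300, 100, 3072);
    assert(failing == 800);
    writefln("first failing request: %s MiB", failing);
}
```

Under these assumptions the last request to succeed is 700 MiB, which is 734003200 bytes, the last size printed by the first sample before the OutOfMemoryError. That agreement is consistent with the explanation but does not prove it.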
On Wed, 12 Nov 2014 10:51:31 -0500 Steven Schveighoffer via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> Keep in mind also that it is a frequent error that people make to set a pointer to null and expect the data will be collected. For example, buf could still be in a register.
>
> I would be interested in how much memory the GC has vs. how much is actually used.

i posted the second sample, where i'm doing `GC.free()` to reclaim memory. as i said, RES is jumping between "almost nothing" and "several GB" as the sample allocates and frees, but VIRT is growing constantly.

i believe that the GC just can't merge segments, so it keeps asking for more and more address space for new segments, leaving the old ones unused and unmerged. this way the GC has a lot of free memory, but when it can't allocate another segment, it throws an "out of memory" error.

if i use libc malloc() for allocating, everything works as i expected: address space consumption is on par with allocation size.
Nov 12 2014
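The RES and VIRT figures mentioned above can also be read from inside the process, which makes it easy to log them on every loop iteration. The sketch below assumes Linux, where /proc/self/status reports them as the `VmRSS` and `VmSize` fields in kB; `statusKb` is a hypothetical helper.

```d
import std.algorithm : startsWith;
import std.array : split;
import std.conv : to;
import std.file : readText;
import std.stdio : writefln;
import std.string : lineSplitter;

/// Extracts the numeric value (in kB) of a "Key:   12345 kB" line from
/// the text of /proc/<pid>/status. Returns 0 if the key is absent.
ulong statusKb(string statusText, string key)
{
    foreach (line; statusText.lineSplitter)
    {
        if (!line.startsWith(key ~ ":"))
            continue;
        auto fields = line[key.length + 1 .. $].split; // split on whitespace
        if (fields.length)
            return fields[0].to!ulong;
    }
    return 0;
}

void main()
{
    auto text = readText("/proc/self/status");
    writefln("VIRT %s kB, RES %s kB",
        statusKb(text, "VmSize"), statusKb(text, "VmRSS"));
}
```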
On Wednesday, 12 November 2014 at 16:06:32 UTC, ketmar via Digitalmars-d wrote:
> if i'll use libc malloc() for allocating, everything works as i expected: address space consumtion is on par with allocation size.

The GC uses C's calloc rather than implementing memory handling itself on top of the OS, so you get fragmentation:

https://github.com/D-Programming-Language/druntime/blob/master/src/gc/gc.d#L2223
Nov 12 2014
On Wed, 12 Nov 2014 16:13:39 +0000 Ola Fosheim Grøstad via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> The gc uses C's calloc rather implementing memory handling itself using the OS so you get fragmentation:
>
> https://github.com/D-Programming-Language/druntime/blob/master/src/gc/gc.d#L2223

hm. my bad, i was checking malloc() with a C program and it happened to use a custom allocator. i just carefully re-checked it and it really works the same as the D GC. sorry.

so this seems to be the libc memory manager's fault after all. sorry for the noise.
Nov 12 2014
On Wed, 12 Nov 2014 16:13:39 +0000 Ola Fosheim Grøstad via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> The gc uses C's calloc rather implementing memory handling itself using the OS so you get fragmentation:
>
> https://github.com/D-Programming-Language/druntime/blob/master/src/gc/gc.d#L2223

ah, and sorry once again, i was tired. ;-) the C program stops at 2,936,012,800, which is much more realistic. i checked three times and it's really using libc `malloc()` now.

so libc malloc is perfectly able to merge segments, while the D GC is not.
Nov 12 2014
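The C program used for the comparison above is not shown in the thread. A comparable experiment can be written in D by calling libc `malloc` directly, as in the sketch below; `lastMallocSize` and the 3072 MiB cap are assumptions made for illustration.

```d
import core.stdc.stdlib : free, malloc;
import std.stdio : writefln;

/// Allocates and frees growing blocks with libc malloc. Returns the last
/// size that succeeded, or 0 if even the first request failed.
/// The loop ends once `size` exceeds `cap`.
size_t lastMallocSize(size_t start, size_t step, size_t cap)
{
    size_t last = 0;
    for (size_t size = start; size <= cap; size += step)
    {
        void* p = malloc(size);
        if (p is null)
            break;             // no contiguous range of this size is left
        free(p);               // release before asking for a bigger block
        last = size;
    }
    return last;
}

void main()
{
    enum size_t MiB = 1024 * 1024;
    // Same schedule as the thread's samples: start at 300MB, grow by 100MB.
    writefln("last successful malloc: %s bytes",
        lastMallocSize(300 * MiB, 100 * MiB, 3072 * MiB));
}
```

On a 64-bit system this usually just runs up to the cap, so the comparison is only meaningful in a 32-bit process like the one discussed here.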
On 11/12/14 11:06 AM, ketmar via Digitalmars-d wrote:
> i posted the second samle where i'm doing `GC.free()` to reclaim memory. as i said, RES is jumping between "almost nothing" and "several GB", as sample allocates and frees. but VIRT is growing constantly. i believe that GC just can't merge segments, so it keep asking for more and more address space for new segments, leaving old ones unused and unmerged. this way GC has alot of free memory, but when it can't allocate another segment, it throws "out of memory error".

Yes, this is what I think is happening.

> if i'll use libc malloc() for allocating, everything works as i expected: address space consumtion is on par with allocation size.

I don't know the internals of C malloc. But I think it should be possible to make D merge segments when it needs to.

One thing I am curious about -- the GC needs to allocate space for its metadata in the heap. That data should be moveable, but I bet it doesn't get moved. That may be why it can't merge segments.

-Steve
Nov 12 2014
On Wed, 12 Nov 2014 11:20:48 -0500 Steven Schveighoffer via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> One thing I am curious about -- it needs to allocate space to deal with metadata in the heap. That data should be moveable, but I bet it doesn't get moved. That may be why it can't merge segments.

looks like a good GC improvement. ;-)
Nov 12 2014
On Wednesday, 12 November 2014 at 16:20:48 UTC, Steven Schveighoffer wrote:
> I don't know the internals of C malloc. But I think it should be possible to make D merge segments when it needs to.

Yes, but then D must provide a malloc replacement.
Nov 12 2014
It's been a while since I worked on this, but I think the GC will effectively call minimize after collecting, so any collected large allocations should be returned to the OS. Allocations larger than 4K get their own dedicated pool, so fragmentation shouldn't come into play here.
Nov 12 2014
Try following the big allocation with a really small allocation to clear out any registers that may be referencing the large block.
Nov 12 2014
On Wed, 12 Nov 2014 16:40:10 +0000 Sean Kelly via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> Try following the big allocation with a really small allocation to clear out any registers that may be referencing the large block.

but this is clearly not the issue with the sample which does `GC.free()`: it stops at 1.7GB, while the C sample does the same and stops at 2.9GB.
Nov 12 2014
On Wednesday, 12 November 2014 at 16:47:47 UTC, ketmar via Digitalmars-d wrote:
> but this clearly not an issue with sample which does `GC.free()`, and it stops at 1.7GB, while C sample does the same and stops at 2.9GB.

You should get debug info by compiling the runtime with the PRINTF debugging flag set:

https://github.com/D-Programming-Language/druntime/blob/master/src/gc/gc.d#L20
Nov 12 2014
On Wednesday, 12 November 2014 at 16:47:47 UTC, ketmar via Digitalmars-d wrote:
> but this clearly not an issue with sample which does `GC.free()`, and it stops at 1.7GB, while C sample does the same and stops at 2.9GB.

The GC probably allocates some small blocks for pools and other data. If one of them lands in the middle of the address space, that can cause fragmentation too. Try adding small allocations to the C sample too.
Nov 13 2014