digitalmars.D.learn - Garbage Collection Pitfall in C++ but not in D?

"akaz" <nemo utopia.com> writes:
Hi,

  Reading about C++11, I stumbled upon this:

  http://www2.research.att.com/~bs/C++0xFAQ.html#gc-abi

  Specifically (quote):

    int* p = new int;
    p+=10;
    // ... collector may run here ...
    p-=10;
    *p = 10;    // can we be sure that the int is still there?

  How does the D garbage collector solve (answer) that?

Thank you.
Jul 06 2012
Denis Shelomovskij <verylonglogin.reg gmail.com> writes:
06.07.2012 17:43, akaz wrote:
> Hi,
>
> Reading about C++11, I stumbled upon this:
>
> http://www2.research.att.com/~bs/C++0xFAQ.html#gc-abi
>
> Specifically (quote):
>
>     int* p = new int;
>     p+=10;
>     // ... collector may run here ...
>     p-=10;
>     *p = 10;    // can we be sure that the int is still there?
>
> How does the D garbage collector solve (answer) that?
>
> Thank you.

If you are interested in D read this first: http://dlang.org/garbage.html

You can find there e.g.:

> Do not add or subtract an offset to a pointer such that the result points outside of the bounds of the garbage collected object originally allocated.

So `p+=10;` is already "undefined behavior".

--
Денис В. Шеломовский
Denis V. Shelomovskij
Jul 06 2012
Alex Rønne Petersen <alex lycus.org> writes:
On 06-07-2012 16:07, Denis Shelomovskij wrote:
> If you are interested in D read this first: http://dlang.org/garbage.html
>
> You can find there e.g.:
>
>> Do not add or subtract an offset to a pointer such that the result points outside of the bounds of the garbage collected object originally allocated.
>
> So `p+=10;` is already "undefined behavior".

I'll just add: Handling this case is basically impossible to do sanely. You can't really know what some pointer off the bounds of a managed memory region is based on. It could literally be based on any memory region in the entire program. You could do heuristics of course, but . . .

--
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Jul 06 2012
Timon Gehr <timon.gehr gmx.ch> writes:
On 07/06/2012 05:39 PM, Alex Rønne Petersen wrote:
> I'll just add: Handling this case is basically impossible to do sanely. You can't really know what some pointer off the bounds of a managed memory region is based on. It could literally be based on any memory region in the entire program. You could do heuristics of course, but . . .

You could run the program in a dedicated VM. :)
Jul 06 2012
Alex Rønne Petersen <alex lycus.org> writes:
On 06-07-2012 22:07, akaz wrote:
> On Friday, 6 July 2012 at 15:39:40 UTC, Alex Rønne Petersen wrote:
>> I'll just add: Handling this case is basically impossible to do sanely. You can't really know what some pointer off the bounds of a managed memory region is based on. It could literally be based on any memory region in the entire program. You could do heuristics of course, but . . .
>
> Is it not possible to make use of addRange() and removeRange() to "block" the GC over a specified time/range?

Those just add and remove root ranges, i.e. areas of memory that will be scanned for [interior] pointers during the marking phase. It still won't solve the problem.

You can disable the GC entirely with GC.disable().

--
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Jul 06 2012
Simon <s.d.hammett gmail.com> writes:
On 06/07/2012 16:39, Alex Rønne Petersen wrote:
> On 06-07-2012 16:07, Denis Shelomovskij wrote:
>> So `p+=10;` is already "undefined behavior".
>
> I'll just add: Handling this case is basically impossible to do sanely. You can't really know what some pointer off the bounds of a managed memory region is based on. It could literally be based on any memory region in the entire program. You could do heuristics of course, but . . .

Never mind what D says, even in C/C++ just doing the p += 10 is invalid.

Creating a pointer that points at invalid memory is just as wrong as dereferencing it would be. The possible crash on the dereference of the pointer is just the symptom of the underlying bug.

With Visual Studio and their implementation of the std containers, in a debug build, if you increment/decrement or otherwise create an iterator that's outside the allowable bounds of the container, you get a runtime assertion failure.

It's a very handy feature and uncovered a load of bugs when we upgraded to VS 2008.

--
My enormous talent is exceeded only by my outrageous laziness.
http://www.ssTk.co.uk
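
For readers without that toolchain, a rough portable analogue of "checked access in the container" is `std::vector::at()`, which throws `std::out_of_range` instead of touching memory outside the container. This is only a sketch of the idea, not the Visual Studio checked-iterator feature itself; the names `access_is_rejected` and `demo_checked_access` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true when reading index i of v through at() is rejected with
// std::out_of_range, i.e. when i is not a valid index of v.
bool access_is_rejected(const std::vector<int>& v, std::size_t i) {
    try {
        (void)v.at(i);   // at() validates the index before reading
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Small self-check mirroring the thread's example: one int, offset 10.
void demo_checked_access() {
    const std::vector<int> one_int{10};
    assert(!access_is_rejected(one_int, 0));   // index 0 exists
    assert(access_is_rejected(one_int, 10));   // index 10 does not
}
```

Unlike a raw `p += 10`, the out-of-range access here is reported at the point where it happens rather than showing up later as a crash.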
Jul 06 2012
Alex Rønne Petersen <alex lycus.org> writes:
On 07-07-2012 05:02, akaz wrote:
> On Friday, 6 July 2012 at 21:10:56 UTC, Simon wrote:
>> Never mind what D says, even in C/C++ just doing the p += 10 is invalid. Creating a pointer that points at invalid memory is just as wrong as dereferencing it would be.

> Actually, p+10 could still be valid memory.
>
> OTOH, the other case that's given on the original page is this one:
>
>     int* p = new int;
>     int x = reinterpret_cast<int>(p);    // non-portable

This will fail on a 64-bit system, but will work fine on a 32-bit system since the pointer is still conservatively visible in x. Not many GCs scan the stack and static data segments precisely, since it's often not worth it. This is the case for D's GC. (Make x size_t and it will always work. In D, anyway.)

>     p=0;
>     // ... collector may run here ...
>     p = reinterpret_cast<int*>(x);
>     *p = 10;    // can we be sure that the int is still there?

Yes (under the conditions described above).
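
The width issue on the C++ side can be sketched as follows. This is a minimal illustration, assuming the implementation provides `std::uintptr_t` (it is optional in the standard, though common toolchains have it); the function names are hypothetical. It says nothing about what a collector does; it only shows that the pointer bits survive a round trip through an integer that is wide enough.

```cpp
#include <cassert>
#include <cstdint>

// Hides a pointer in an integer wide enough to hold it and recovers it again.
// Assumes std::uintptr_t exists on the target platform.
int* round_trip_through_integer(int* p) {
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    int* recovered = reinterpret_cast<int*>(bits);
    assert(recovered == p);   // the round trip preserves the pointer value
    return recovered;
}

// True when a plain int can hold every bit of a pointer. On typical 64-bit
// targets, where int is 32 bits wide, this is false.
constexpr bool int_holds_pointer() {
    return sizeof(int) >= sizeof(void*);
}
```

With a plain `int`, the quoted snippet only has a chance where `int_holds_pointer()` is true; with a pointer-sized integer, the value stays intact.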
> So, the pointer could sometimes simply disappear temporarily, without becoming invalid (well, p==NULL is somewhat invalid, but it could have been p=&q). Just some allocated memory is no longer referenced for the time being and this could trigger the GC without protection (except disabling GC for the entire application).
>
> Won't some functions doing just what addRange() and removeRange() do solve that kind of problem (if necessary)? That means, forbidding the GC to scan some memory area for some time?

I think you misunderstand what those functions do. See my earlier reply.

addRange() merely lets the GC know that a region of memory may contain pointers into GC memory on every machine word boundary of the region. This region of memory is scanned conservatively, i.e. an integer inside the region which looks like a pointer will keep the memory that the pointer value would point into alive.

The stack is always scanned by the GC, and in D's case, conservatively.

removeRange() just removes a root memory range; no magic there.

--
Alex Rønne Petersen
alex lycus.org
http://lycus.org
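
To make "scanned conservatively" concrete, here is a toy model in C++. It is not D's GC or its API: the heap is just a list of numeric address ranges, the root range is a list of machine words, and `Block`, `mark_conservatively` and `survives` are names invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A toy heap block described by plain numbers instead of real addresses.
struct Block {
    std::uintptr_t start;   // first address of the block
    std::size_t size;       // size in bytes
    bool marked;            // set when some root word points into the block
};

// Treats every word in `roots` as a potential pointer and marks each block
// whose address range contains the word's value (interior pointers count).
void mark_conservatively(const std::vector<std::uintptr_t>& roots,
                         std::vector<Block>& heap) {
    for (std::uintptr_t word : roots) {
        for (Block& block : heap) {
            assert(block.size > 0);
            if (word >= block.start && word - block.start < block.size) {
                block.marked = true;
            }
        }
    }
}

// Convenience helper: is a block at [start, start + size) kept alive by roots?
bool survives(std::uintptr_t start, std::size_t size,
              const std::vector<std::uintptr_t>& roots) {
    std::vector<Block> heap{{start, size, false}};
    mark_conservatively(roots, heap);
    return heap[0].marked;
}
```

In this model a 4-byte block at 0x1000 survives when a root word holds 0x1000 or 0x1002, whether that word is a pointer or an integer. It does not survive when the only word left is 0x1000 + 10 * 4, which is the `p += 10` situation from the first post.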
Jul 06 2012
"akaz" <nemo utopia.com> writes:
> If you are interested in D read this first:
> http://dlang.org/garbage.html
>
> You can find there e.g.:
>
>> Do not add or subtract an offset to a pointer such that the result points outside of the bounds of the garbage collected object originally allocated.
>
> So `p+=10;` is already "undefined behavior".

Thank you, this clears up the issue.
Jul 06 2012
"akaz" <nemo utopia.com> writes:
On Friday, 6 July 2012 at 15:39:40 UTC, Alex Rønne Petersen wrote:
> I'll just add: Handling this case is basically impossible to do sanely. You can't really know what some pointer off the bounds of a managed memory region is based on. It could literally be based on any memory region in the entire program. You could do heuristics of course, but . . .

Is it not possible to make use of addRange() and removeRange() to "block" the GC over a specified time/range?
Jul 06 2012
"akaz" <nemo utopia.com> writes:
On Friday, 6 July 2012 at 21:10:56 UTC, Simon wrote:
> Never mind what D says, even in C/C++ just doing the p += 10 is invalid. Creating a pointer that points at invalid memory is just as wrong as dereferencing it would be.

Actually, p+10 could still be valid memory.

OTOH, the other case that's given on the original page is this one:

    int* p = new int;
    int x = reinterpret_cast<int>(p);    // non-portable
    p=0;
    // ... collector may run here ...
    p = reinterpret_cast<int*>(x);
    *p = 10;    // can we be sure that the int is still there?

So, the pointer could sometimes simply disappear temporarily, without becoming invalid (well, p==NULL is somewhat invalid, but it could have been p=&q). Just some allocated memory is no longer referenced for the time being and this could trigger the GC without protection (except disabling GC for the entire application).

Won't some functions doing just what addRange() and removeRange() do solve that kind of problem (if necessary)? That means, forbidding the GC to scan some memory area for some time?
Jul 06 2012
"akaz" <nemo utopia.com> writes:
> Won't some functions doing just what addRange() and removeRange() do solve that kind of problem (if necessary)? That means, forbidding the GC to scan some memory area for some time?

Like their C++11 counterparts:

    void declare_reachable(void* p);    // the region of memory starting at p
                                        // (and allocated by some allocator
                                        // operation which remembers its size)
                                        // must not be collected
    template<class T> T* undeclare_reachable(T* p);

    void declare_no_pointers(char* p, size_t n);    // p[0..n] holds no pointers
    void undeclare_no_pointers(char* p, size_t n);
Jul 06 2012
"David Nadlinger" <see klickverbot.at> writes:
On Saturday, 7 July 2012 at 03:02:07 UTC, akaz wrote:
> On Friday, 6 July 2012 at 21:10:56 UTC, Simon wrote:
>> Never mind what D says, even in C/C++ just doing the p += 10 is invalid.
>>
>> Creating a pointer that points at invalid memory is just as wrong as dereferencing it would be.
>
> Actually, p+10 could still be valid memory.

Not if p+10 was not allocated as part of the same block as p was (i.e. if p is the result of »new int«, it is always illegal). It might »physically« work with the common C/D implementations if another chunk of your memory is at that address, just as *(cast(int*)0xdeadbeef) could potentially work, but it is undefined behavior by the rules of the language.

If the compiler can prove that when allocating the memory p points at, p + 10 was not allocated as well, it would even be free to directly replace a read from that pointer with an undefined value.

David
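
The "same block" rule can be sketched in C++ as a helper that refuses to form the pointer at all unless the result stays inside the allocation. This is an illustration under stated assumptions, not a library function: `offset_within` is a hypothetical name, and the caller is assumed to know how many elements were allocated at `base`.

```cpp
#include <cassert>
#include <cstddef>

// Returns base + offset only when the result stays inside [base, base + count].
// One past the end is a legal pointer value in C++ (it just must not be
// dereferenced). Anything else yields nullptr, so no out-of-bounds pointer
// is ever formed.
int* offset_within(int* base, std::size_t count, std::ptrdiff_t offset) {
    assert(base != nullptr);
    if (offset < 0 || static_cast<std::size_t>(offset) > count) {
        return nullptr;   // refuse instead of computing base + offset
    }
    return base + offset;
}
```

For a single `new int` the count is 1, so an offset of 10 is rejected. For a block of 16 ints, offset 10 is fine, which matches the distinction drawn above.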
Jul 07 2012