digitalmars.D - Likely closure memory corruption

reply "deadalnix" <deadalnix gmail.com> writes:
auto objectSource = new FileSource("../libs/object.d");
auto object = lex!((line, index, length) {
	import std.stdio;
	writeln("new location 2 ! ", cast(void*) objectSource);
	
	return Location(objectSource, line, index, length);
})(objectSource.content);

Running this, I see that at some point objectSource changes. 
The output is:
new location 2 ! 7F494EEC1D40
new location 2 ! 7F494EEC1D40
...
new location 2 ! 7F494EEC1D40
new location 2 ! 7F494EEC2EA0

Obviously, the program segfaults soon after that.

It sounds like some memory corruption occurs under the hood. What 
can I do to work around that bug and to help solve it?
Mar 03 2013
next sibling parent reply "Iain Buclaw" <ibuclaw ubuntu.com> writes:
On Sunday, 3 March 2013 at 16:48:32 UTC, deadalnix wrote:
 auto objectSource = new FileSource("../libs/object.d");
 auto object = lex!((line, index, length) {
 	import std.stdio;
 	writeln("new location 2 ! ", cast(void*) objectSource);
 	
 	return Location(objectSource, line, index, length);
 })(objectSource.content);

 Running this, I see that at some point, objectSource is 
 changed. The output is
 new location 2 ! 7F494EEC1D40
 new location 2 ! 7F494EEC1D40
 ...
 new location 2 ! 7F494EEC1D40
 new location 2 ! 7F494EEC2EA0

 Obviously, the program segfault soon after that.

 It sounds like some memory corruption occurs under the hood. 
 What can I do to work around that bug and to help solving it ?
Is this a dmd thing, or does it affect other compilers? (I don't have a laptop to test at the moment.)
Regards
Iain
Mar 03 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 3 March 2013 at 16:51:15 UTC, Iain Buclaw wrote:
 Is this a dmd thing, or does it affect other compilers? (don't 
 have a laptop to test at the moment).
I don't know; the code does not compile with 2.060 (or 2.061), so I'm unable to test how gdc does on that one.
Mar 03 2013
parent reply Iain Buclaw <ibuclaw ubuntu.com> writes:
On Mar 3, 2013 5:01 PM, "deadalnix" <deadalnix gmail.com> wrote:
 On Sunday, 3 March 2013 at 16:51:15 UTC, Iain Buclaw wrote:
 Is this a dmd thing, or does it affect other compilers? (don't have a
laptop to test at the moment).

 I don't know, the code is not compilable with 2.060 (or 2.061) so I'm
unable to test how gdc does on that one.
GDC has been on the 2.062 frontend for at least a fortnight. I know I don't announce these things, but I am still busy working on the next set of refactoring.
Regards
-- 
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
Mar 03 2013
next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 3 March 2013 at 17:03:24 UTC, Iain Buclaw wrote:
 On Mar 3, 2013 5:01 PM, "deadalnix" <deadalnix gmail.com> wrote:
 On Sunday, 3 March 2013 at 16:51:15 UTC, Iain Buclaw wrote:
 Is this a dmd thing, or does it affect other compilers? 
 (don't have a
laptop to test at the moment).

 I don't know, the code is not compilable with 2.060 (or 2.061) 
 so I'm
unable to test how gdc does on that one. GDC has been on 2.062 frontend for at least a fortnight. I know I don't announce these things, but am still busy working on next set of refactoring. Regards
OK, I'll give it a try tomorrow after some happy compilation.
Mar 03 2013
prev sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 3 March 2013 at 17:03:24 UTC, Iain Buclaw wrote:
 On Mar 3, 2013 5:01 PM, "deadalnix" <deadalnix gmail.com> wrote:
 On Sunday, 3 March 2013 at 16:51:15 UTC, Iain Buclaw wrote:
 Is this a dmd thing, or does it affect other compilers? 
 (don't have a
laptop to test at the moment).

 I don't know, the code is not compilable with 2.060 (or 2.061) 
 so I'm
unable to test how gdc does on that one. GDC has been on 2.062 frontend for at least a fortnight. I know I don't announce these things, but am still busy working on next set of refactoring. Regards
Tested: the corruption doesn't appear with GDC. Good job!
Mar 03 2013
parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On Mar 4, 2013 2:11 AM, "deadalnix" <deadalnix gmail.com> wrote:
 On Sunday, 3 March 2013 at 17:03:24 UTC, Iain Buclaw wrote:
 On Mar 3, 2013 5:01 PM, "deadalnix" <deadalnix gmail.com> wrote:
 On Sunday, 3 March 2013 at 16:51:15 UTC, Iain Buclaw wrote:
 Is this a dmd thing, or does it affect other compilers? (don't have a
laptop to test at the moment).

 I don't know, the code is not compilable with 2.060 (or 2.061) so I'm
unable to test how gdc does on that one. GDC has been on 2.062 frontend for at least a fortnight. I know I don't announce these things, but am still busy working on next set of refactoring. Regards
Tested, the corruption don't appear with GDC. Good job !
Excellente! :o) Regards -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Mar 04 2013
prev sibling next sibling parent reply "Maxim Fomin" <maxim maxim-fomin.ru> writes:
On Sunday, 3 March 2013 at 16:48:32 UTC, deadalnix wrote:
 auto objectSource = new FileSource("../libs/object.d");
 auto object = lex!((line, index, length) {
 	import std.stdio;
 	writeln("new location 2 ! ", cast(void*) objectSource);
 	
 	return Location(objectSource, line, index, length);
 })(objectSource.content);

 Running this, I see that at some point, objectSource is 
 changed. The output is
 new location 2 ! 7F494EEC1D40
 new location 2 ! 7F494EEC1D40
 ...
 new location 2 ! 7F494EEC1D40
 new location 2 ! 7F494EEC2EA0

 Obviously, the program segfault soon after that.

 It sounds like some memory corruption occurs under the hood. 
 What can I do to work around that bug and to help solving it ?
It is unlikely that this particular closure corrupts data, since the address is correct three times in a row. Closure bugs typically show up as wrong generated code, so there would be no difference between the addresses. Something else in the scope of "objectSource" may be corrupting it. You can try giving the closure parameters explicit types, or rewriting the closure as a nested function; this used to mitigate some closure bugs in the past. Without compilable source I cannot say anything more.
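The suggestion above (explicit parameter types, named nested function instead of a template lambda) could look roughly like the sketch below. FileSource, Location and lex are not shown in the thread, so the versions here are hypothetical stand-ins, and lexWithNestedFunction is an illustrative name only; the real lexer is far larger.

```d
import std.stdio;

// Stand-in types: the real FileSource and Location are not shown in the thread.
class FileSource
{
    string content;
    this(string content) { this.content = content; }
}

struct Location
{
    FileSource source;
    uint line, index, length;
}

// Hypothetical stand-in for lex: calls the alias once per line of input.
Location[] lex(alias locationProvider)(string content)
{
    import std.string : splitLines;

    Location[] result;
    uint index = 0;
    foreach (i, l; content.splitLines())
    {
        result ~= locationProvider(cast(uint) (i + 1), index, cast(uint) l.length);
        index += cast(uint) l.length + 1; // skip the line terminator
    }
    return result;
}

Location[] lexWithNestedFunction(FileSource objectSource)
{
    // Named nested function with explicit parameter types,
    // used in place of the untyped (line, index, length) lambda.
    Location makeLocation(uint line, uint index, uint length)
    {
        return Location(objectSource, line, index, length);
    }

    return lex!makeLocation(objectSource.content);
}

void main()
{
    auto src = new FileSource("module object;\nalias size_t = ulong;");
    auto locs = lexWithNestedFunction(src);
    assert(locs.length == 2);
    assert(locs[0].source is src);
    writeln(locs.length, " locations");
}
```

Whether this avoids the corruption depends on where the compiler bug actually is; it only changes how the callback is declared.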
Mar 03 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 3 March 2013 at 17:27:52 UTC, Maxim Fomin wrote:
 On Sunday, 3 March 2013 at 16:48:32 UTC, deadalnix wrote:
 auto objectSource = new FileSource("../libs/object.d");
 auto object = lex!((line, index, length) {
 	import std.stdio;
 	writeln("new location 2 ! ", cast(void*) objectSource);
 	
 	return Location(objectSource, line, index, length);
 })(objectSource.content);

 Running this, I see that at some point, objectSource is 
 changed. The output is
 new location 2 ! 7F494EEC1D40
 new location 2 ! 7F494EEC1D40
 ...
 new location 2 ! 7F494EEC1D40
 new location 2 ! 7F494EEC2EA0

 Obviously, the program segfault soon after that.

 It sounds like some memory corruption occurs under the hood. 
 What can I do to work around that bug and to help solving it ?
It is unlikely that the particular closure corrupts data since at three times the address is correct. Closure bugs are typically revealed as wrong function code, so there would no difference between addresses. Something else in scope of "objectSource" may corrupt it. You can try to pass explicitly types of closure parameters, rewrite closure as a nested function - it used to mitigate some closure bugs in the past. Without compilable source I cannot say anything more.
The problem here is that the codebase involved is rather huge. Do you have any advice for reducing the issue?
Mar 03 2013
parent reply "Maxim Fomin" <maxim maxim-fomin.ru> writes:
On Monday, 4 March 2013 at 02:10:12 UTC, deadalnix wrote:
 On Sunday, 3 March 2013 at 17:27:52 UTC, Maxim Fomin wrote:
 On Sunday, 3 March 2013 at 16:48:32 UTC, deadalnix wrote:
 auto objectSource = new FileSource("../libs/object.d");
 auto object = lex!((line, index, length) {
 	import std.stdio;
 	writeln("new location 2 ! ", cast(void*) objectSource);
 	
 	return Location(objectSource, line, index, length);
 })(objectSource.content);

 Running this, I see that at some point, objectSource is 
 changed. The output is
 new location 2 ! 7F494EEC1D40
 new location 2 ! 7F494EEC1D40
 ...
 new location 2 ! 7F494EEC1D40
 new location 2 ! 7F494EEC2EA0

 Obviously, the program segfault soon after that.

 It sounds like some memory corruption occurs under the hood. 
 What can I do to work around that bug and to help solving it ?
It is unlikely that the particular closure corrupts data since at three times the address is correct. Closure bugs are typically revealed as wrong function code, so there would no difference between addresses. Something else in scope of "objectSource" may corrupt it. You can try to pass explicitly types of closure parameters, rewrite closure as a nested function - it used to mitigate some closure bugs in the past. Without compilable source I cannot say anything more.
The problem here is that the codebase involved is rather huge. Do you have an advice to reduce the issue ?
No specific advice, other than dropping everything except the FileSource definition and the lex template. By the way, what is objectSource.content? A tuple of line, index, length?
Mar 03 2013
parent "deadalnix" <deadalnix gmail.com> writes:
On Monday, 4 March 2013 at 07:10:20 UTC, Maxim Fomin wrote:
 No specific advice except for dropping everything except 
 FileSource definition and lex template. By the way, what is 
 objectSource.content? A tuple of line, index, length?
The problem is that lex is massive: it is a whole lexer. objectSource.content is simply a string.
Mar 03 2013
prev sibling parent reply "Don" <turnyourkidsintocash nospam.com> writes:
On Sunday, 3 March 2013 at 16:48:32 UTC, deadalnix wrote:
...
 Obviously, the program segfault soon after that.

 It sounds like some memory corruption occurs under the hood. 
 What can I do to work around that bug and to help solving it ?
Have you compiled this with the latest git HEAD? Several very nasty wrong-code bugs were fixed very recently (e.g., bug 9568).
Mar 04 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Monday, 4 March 2013 at 10:55:58 UTC, Don wrote:
 On Sunday, 3 March 2013 at 16:48:32 UTC, deadalnix wrote:
 ...
 Obviously, the program segfault soon after that.

 It sounds like some memory corruption occurs under the hood. 
 What can I do to work around that bug and to help solving it ?
Have you compiled this with the latest gitHEAD ? Several very nasty wrong-code bugs were fixed very recently (eg, bug 9568).
No luck. I just tested it and the problem is still there.
Mar 04 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Monday, 4 March 2013 at 14:21:11 UTC, deadalnix wrote:
 On Monday, 4 March 2013 at 10:55:58 UTC, Don wrote:
 On Sunday, 3 March 2013 at 16:48:32 UTC, deadalnix wrote:
 ...
 Obviously, the program segfault soon after that.

 It sounds like some memory corruption occurs under the hood. 
 What can I do to work around that bug and to help solving it ?
Have you compiled this with the latest gitHEAD ? Several very nasty wrong-code bugs were fixed very recently (eg, bug 9568).
No luck. I just tested it and it doesn't work.
I did some investigation yesterday. It seems that the frame pointer passed when calling the closure is not always the right one. I can now catch the problem as soon as it occurs. I'll try to reduce it to a simpler test case, but that seems really difficult.
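One way to catch a wrong frame pointer at the first bad call is a tripwire inside the closure that compares the captured reference with a copy stored outside any frame. The sketch below is only an illustration of that idea: FileSource, callManyTimes (standing in for the lexer) and runWithTripwire are hypothetical names, not code from the project in question.

```d
import std.stdio;

// Stand-in for the real FileSource, which is not shown in the thread.
class FileSource
{
    string content;
    this(string content) { this.content = content; }
}

// Reference value kept at module level, outside any stack frame or
// closure context, so a wrong frame pointer cannot change what is read here.
void* expectedSource;

// Hypothetical stand-in for the lexer: invokes the callback n times.
void callManyTimes(alias callback)(size_t n)
{
    foreach (i; 0 .. n)
        callback(i);
}

size_t runWithTripwire(FileSource objectSource, size_t n)
{
    expectedSource = cast(void*) objectSource;
    size_t calls = 0;

    callManyTimes!((size_t i) {
        // Fails on the first call that sees a different captured reference.
        assert(cast(void*) objectSource is expectedSource,
               "captured objectSource changed");
        ++calls;
    })(n);

    return calls;
}

void main()
{
    auto src = new FileSource("module object;");
    assert(runWithTripwire(src, 1000) == 1000);
    writeln("no mismatch observed");
}
```

Note that assert is compiled out with -release, so a check like this only fires in non-release builds.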
Mar 06 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Thursday, 7 March 2013 at 04:21:01 UTC, deadalnix wrote:
 On Monday, 4 March 2013 at 14:21:11 UTC, deadalnix wrote:
 On Monday, 4 March 2013 at 10:55:58 UTC, Don wrote:
 On Sunday, 3 March 2013 at 16:48:32 UTC, deadalnix wrote:
 ...
 Obviously, the program segfault soon after that.

 It sounds like some memory corruption occurs under the hood. 
 What can I do to work around that bug and to help solving it 
 ?
Have you compiled this with the latest gitHEAD ? Several very nasty wrong-code bugs were fixed very recently (eg, bug 9568).
No luck. I just tested it and it doesn't work.
I did some investigation yesterday. It seems that the frame pointer that is passed when calling the closure is not always the right one. I now can catch the problem as soon as it occurs. I'll try to reduce it to a simpler test case, but it seems really difficult.
Sooooooo,

I have a struct. The struct has a context pointer. I have this method:

@property
auto save() inout {
	return inout(Lexer)(t, r.save, line, index);
}

The context pointer IS NOT COPIED.

I fixed it this way:

@property
auto save() inout {
	// XXX: dmd bug, context pointer isn't copied properly
	// doing it manually using black magic.
	// Context pointer is the last element of the struct. Here in position 9.
	auto ret = inout(Lexer)(t, r.save, line, index);
	(cast(void**) &ret)[9] = (cast(void**) &this)[9];
	
	return ret;
}

It is very scary that I have to do this kind of thing.
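The hard-coded index 9 in the workaround breaks silently if a field is added to or removed from the struct. A slightly less fragile variant could derive the index from the struct size, as in the sketch below. It rests on the same assumption stated in the post (the context pointer is the last pointer-sized word of the struct) plus the assumption that there is no trailing padding; copyLastWord and Fake are hypothetical names, and Fake is an ordinary struct used only to make the example self-contained.

```d
import std.stdio;

// Hypothetical helper: copies the last pointer-sized word of src into dst.
// Assumes the hidden context pointer occupies that word.
void copyLastWord(T)(ref T dst, ref const T src)
    if (is(T == struct) && T.sizeof >= (void*).sizeof
        && T.sizeof % (void*).sizeof == 0)
{
    enum last = T.sizeof / (void*).sizeof - 1;
    (cast(void**) &dst)[last] = (cast(void**) &src)[last];
}

// Ordinary struct whose last field plays the role of the context pointer.
struct Fake
{
    size_t a;
    size_t b;
    void* ctx;
}

void main()
{
    int x;
    auto src = Fake(1, 2, &x);
    auto dst = Fake(3, 4, null);

    copyLastWord(dst, src);

    assert(dst.ctx is &x);            // last word was copied
    assert(dst.a == 3 && dst.b == 4); // other fields are untouched
    writeln("last word copied");
}
```

This is still a layout hack around a compiler bug, not something to keep once the bug is fixed.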
Mar 08 2013
next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Friday, 8 March 2013 at 16:25:56 UTC, deadalnix wrote:
 Sooooooo,

 I have a struct. The struct have a context pointer. I have this 
 method :

  property
 auto save() inout {
 	return inout(Lexer)(t, r.save, line, index);
 }

 The context pointer IS NOT COPIED.

 Fixed it that way :

  property
 auto save() inout {
 	// XXX: dmd bug, context pointer isn't copied properly
 	// doing it manualy using black magic.
 	// Context pointer is the last element of the struct. Here in 
 position 9.
 	auto ret = inout(Lexer)(t, r.save, line, index);
 	(cast(void**) &ret)[9] = (cast(void**) &this)[9];
 	
 	return ret;
 }

 Very scary that I have to do that kind of things.
Is this a known bug? Come on, this is a really bad bug, not the type of thing that can be ignored!
Mar 10 2013
next sibling parent "Maxim Fomin" <maxim maxim-fomin.ru> writes:
On Sunday, 10 March 2013 at 19:11:07 UTC, deadalnix wrote:
 On Friday, 8 March 2013 at 16:25:56 UTC, deadalnix wrote:
 Sooooooo,

 I have a struct. The struct have a context pointer. I have 
 this method :

  property
 auto save() inout {
 	return inout(Lexer)(t, r.save, line, index);
 }

 The context pointer IS NOT COPIED.

 Fixed it that way :

  property
 auto save() inout {
 	// XXX: dmd bug, context pointer isn't copied properly
 	// doing it manualy using black magic.
 	// Context pointer is the last element of the struct. Here in 
 position 9.
 	auto ret = inout(Lexer)(t, r.save, line, index);
 	(cast(void**) &ret)[9] = (cast(void**) &this)[9];
 	
 	return ret;
 }

 Very scary that I have to do that kind of things.
Is this a know bug ? Come on, this is a really bad bug, not the type of thing that can be ignored !
(I do not know of such a bug.)

This code works:

auto foo()
{
    int i;
    struct S
    {
        int a, b, c;
        int foo() { return i; }
        auto save() inout { return inout(S)(i, i, i); }
    }
    return S();
}

void main()
{
    auto s1 = foo();
    auto s2 = s1.save();
    assert(s2.foo() is 0);
}

So, from your description I cannot reproduce the bug. Reducing is a time-consuming and unpleasant activity, but it is sometimes a necessity.
Mar 10 2013
prev sibling parent Brad Roberts <braddr puremagic.com> writes:
On 3/10/2013 12:11 PM, deadalnix wrote:
 On Friday, 8 March 2013 at 16:25:56 UTC, deadalnix wrote:
 Sooooooo,

 I have a struct. The struct have a context pointer. I have this method :

  property
 auto save() inout {
     return inout(Lexer)(t, r.save, line, index);
 }

 The context pointer IS NOT COPIED.

 Fixed it that way :

  property
 auto save() inout {
     // XXX: dmd bug, context pointer isn't copied properly
     // doing it manualy using black magic.
     // Context pointer is the last element of the struct. Here in position 9.
     auto ret = inout(Lexer)(t, r.save, line, index);
     (cast(void**) &ret)[9] = (cast(void**) &this)[9];
     
     return ret;
 }

 Very scary that I have to do that kind of things.
Is this a know bug ? Come on, this is a really bad bug, not the type of thing that can be ignored !
Is it in bugzilla? The newsgroups are a bad place to be reporting bugs.
Mar 10 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/8/2013 8:25 AM, deadalnix wrote:
 I have a struct. The struct have a context pointer. I have this method :

  property
 auto save() inout {
      return inout(Lexer)(t, r.save, line, index);
 }

 The context pointer IS NOT COPIED.

 Fixed it that way :

  property
 auto save() inout {
      // XXX: dmd bug, context pointer isn't copied properly
      // doing it manualy using black magic.
      // Context pointer is the last element of the struct. Here in position 9.
      auto ret = inout(Lexer)(t, r.save, line, index);
      (cast(void**) &ret)[9] = (cast(void**) &this)[9];

      return ret;
 }
1. We can't do anything with code snippets like that. A complete, compilable example is necessary.
2. Such bug reports, along with the complete example demonstrating the problem, need to go into bugzilla, not here.
Mar 10 2013
next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 10 March 2013 at 21:01:11 UTC, Walter Bright wrote:
 On 3/8/2013 8:25 AM, deadalnix wrote:
 I have a struct. The struct have a context pointer. I have 
 this method :

  property
 auto save() inout {
     return inout(Lexer)(t, r.save, line, index);
 }

 The context pointer IS NOT COPIED.

 Fixed it that way :

  property
 auto save() inout {
     // XXX: dmd bug, context pointer isn't copied properly
     // doing it manualy using black magic.
     // Context pointer is the last element of the struct. Here 
 in position 9.
     auto ret = inout(Lexer)(t, r.save, line, index);
     (cast(void**) &ret)[9] = (cast(void**) &this)[9];

     return ret;
 }
1. We can't do anything with code snippets like that. A complete, compilable example is necessary. 2. Such bug reports, along with the complete example demonstrating it, needs to go into bugzilla, not here.
http://d.puremagic.com/issues/show_bug.cgi?id=9685

Isn't this an already known bug?
Mar 10 2013
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 3/10/2013 10:46 PM, deadalnix wrote:
 http://d.puremagic.com/issues/show_bug.cgi?id=9685

 Isn't it an already known bug ?
If that's it, then good.
Mar 11 2013
prev sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 11 March 2013 05:46, deadalnix <deadalnix gmail.com> wrote:

 On Sunday, 10 March 2013 at 21:01:11 UTC, Walter Bright wrote:

 On 3/8/2013 8:25 AM, deadalnix wrote:

 I have a struct. The struct have a context pointer. I have this method :

  property
 auto save() inout {
     return inout(Lexer)(t, r.save, line, index);
 }

 The context pointer IS NOT COPIED.

 Fixed it that way :

  property
 auto save() inout {
     // XXX: dmd bug, context pointer isn't copied properly
     // doing it manualy using black magic.
     // Context pointer is the last element of the struct. Here in
 position 9.
     auto ret = inout(Lexer)(t, r.save, line, index);
     (cast(void**) &ret)[9] = (cast(void**) &this)[9];

     return ret;
 }
1. We can't do anything with code snippets like that. A complete, compilable example is necessary. 2. Such bug reports, along with the complete example demonstrating it, needs to go into bugzilla, not here.
 http://d.puremagic.com/issues/show_bug.cgi?id=9685

 Isn't it an already known bug ?
Works on gdc. :o) -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Mar 11 2013
prev sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 10 March 2013 21:01, Walter Bright <newshound2 digitalmars.com> wrote:

 On 3/8/2013 8:25 AM, deadalnix wrote:

 I have a struct. The struct have a context pointer. I have this method :

  property
 auto save() inout {
      return inout(Lexer)(t, r.save, line, index);
 }

 The context pointer IS NOT COPIED.

 Fixed it that way :

  property
 auto save() inout {
      // XXX: dmd bug, context pointer isn't copied properly
      // doing it manualy using black magic.
      // Context pointer is the last element of the struct. Here in
 position 9.
      auto ret = inout(Lexer)(t, r.save, line, index);
      (cast(void**) &ret)[9] = (cast(void**) &this)[9];

      return ret;
 }
1. We can't do anything with code snippets like that. A complete, compilable example is necessary. 2. Such bug reports, along with the complete example demonstrating it, needs to go into bugzilla, not here.
Agreed. Can you also raise a bug for that failing gdc case? I'm in the middle of improving the custom static chain passing code in gdc, and want to try and stamp out as much as possible in the process.
Thanks
-- 
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
Mar 11 2013