digitalmars.D - void* pointers get corrupted: D bug or misunderstanding?

reply Federico Santamorena <federico santamorena.me> writes:
Hello, I am writing on the forum for the first time, hoping to 
finally find the answer to a problem I am having:

I am calling native GTK with extern(C) function declarations 
instead of using gtkD, mainly because I need very few functions 
and I don't want a big library that would take about 20x longer 
to compile than my project with its handful of bindings.

Either I found a D bug, or there is a gap in the documentation 
about extern(C), or this is a very specific case that doesn't 
happen often.

I believe this is the same "bug" as this:
https://forum.dlang.org/thread/fjfftrruedmzdcqmrbci@forum.dlang.org

So the problem I am noticing is very simple. As you may know, GTK 
callbacks have a void* user_data argument. What happens is that 
if the flow of code is D => C => D, the void* pointer changes its 
value and gets corrupted when it reaches the third step and 
finally re-emerges into D.

A simple example: pass a D object through the void* user_data 
GTK argument, pass an extern(C) callback function to GTK, and 
then inside that callback call an extern(D) function that 
accepts a void*.

I already tried using Variant, but pointers to Variant get 
corrupted too, so it's useless.

I tested this with Valgrind and there are no stack corruptions, 
GDB confirms that any void* pointer doing the D => C => D route 
gets corrupted (changes its value to a random(?) one) in the 
final step.

The TL;DR is: when the flow of code is inside an extern(C) 
function with a D object passed as a void* pointer, there seems 
to be no way to pass that object on to another D function 
accepting a void* pointer.

Additional Notes:
I tried LDC (LLVM) and it still happens, so it's not a DMD-specific bug.
Jul 28 2019
next sibling parent reply Dennis <dkorpel gmail.com> writes:
On Sunday, 28 July 2019 at 18:32:24 UTC, Federico Santamorena 
wrote:
 So the problem I am noticing is very simple: as you may know 
 GTK callbacks have a void* user_data argument, what happens, is 
 that if the flow of code is this: D => C => D the void* pointer 
 when reaching the third step and finally emerging to the D 
 language will change its value and get corrupted.
Can you give your exact callback function definition? I once had my context pointer corrupted because I marked my callback extern(C) instead of extern(Windows). Maybe you also have an error in your definition somewhere.
Jul 28 2019
next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Sunday, 28 July 2019 at 18:37:26 UTC, Dennis wrote:
 Can you give your exact callback function definition?
Yes, indeed. D's `long` and C's `long` are incompatible too, so using the wrong one there can cause this kind of problem as well (the argument before the pointer is the wrong size, so the pointer value is then pulled from the wrong location).
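A minimal sketch of how to keep such a signature in sync, assuming a hypothetical C callback prototype `void on_event(long code, void *user_data)` (the names `onEvent`, `EventCallback` and `roundTrip` are made up for illustration and are not from GTK). It uses `c_long` from `core.stdc.config`, which follows the size of C's `long` on the target, whereas D's `long` is always 64 bits.

```d
import core.stdc.config : c_long;

// Hypothetical C prototype, for illustration only:
//     void on_event(long code, void *user_data);
alias EventCallback = extern(C) void function(c_long code, void* userData);

// Remembers the last pointer the callback received.
__gshared void* lastUserData;

extern(C) void onEvent(c_long code, void* userData)
{
    lastUserData = userData;
}

// Calls the callback the way a C library would and returns what it saw.
void* roundTrip(void* userData)
{
    EventCallback cb = &onEvent;
    cb(7, userData);
    return lastUserData;
}

void main()
{
    // D's long is always 64 bits; C's long is 32 or 64 bits depending on the target.
    static assert(long.sizeof == 8);
    static assert(c_long.sizeof == 4 || c_long.sizeof == 8);

    int payload = 42;
    assert(roundTrip(&payload) is cast(void*) &payload);
}
```

On targets where C's `long` is 32 bits, declaring the first parameter as D `long` instead would shift where the callee looks for the pointer argument.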
Jul 28 2019
prev sibling parent reply Federico Santamorena <federico santamorena.me> writes:
On Sunday, 28 July 2019 at 18:37:26 UTC, Dennis wrote:
Can you give your exact callback function definition? I once had my context pointer corrupted because I marked my callback extern(C) instead of extern(Windows). Maybe you also have an error in your definition somewhere.
It's a bit complex but:

context is a pointer to a D struct

    extern(C) void gtk_search_changed(GtkEditable* widget, void* data)
    {
        //body
    }

    context.search_input.g_signal_connect("changed", &gtk_search_changed, context);

Then inside gtk_search_changed, startCrawling gets called and context is just passed to it as the last argument:

    DrillContext* startCrawling(
        in const(DrillConfig) config,
        in immutable(string) searchValue,
        in immutable(void function(immutable(FileInfo) result, void* userObject)) resultCallback,
        in void* userObject)
    {
        //body
    }

And gets called like this:

    startCrawling(drillConfig, searchString, &resultFound, context);

Then when the resultCallback gets called its userObject (that now should be a pointer to context) is now corrupt:

    void resultFound(immutable(FileInfo) result, void* userObject)
    {
        //body
    }

And now inside resultFound the void* is completely corrupt
Jul 28 2019
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Sunday, 28 July 2019 at 18:54:48 UTC, Federico Santamorena 
wrote:
 void resultFound(immutable(FileInfo) result, void* userObject)
Is FileInfo the class from gtkd or is it something else?
Jul 28 2019
parent Federico Santamorena <federico santamorena.me> writes:
On Sunday, 28 July 2019 at 18:59:59 UTC, Adam D. Ruppe wrote:
 On Sunday, 28 July 2019 at 18:54:48 UTC, Federico Santamorena 
 wrote:
 void resultFound(immutable(FileInfo) result, void* userObject)
Is FileInfo the class from gtkd or is it something else?
https://github.com/yatima1460/Drill/blob/576b4b691ea5357e5271115433eb11d4c0beeae7/Source/Backend/FileInfo.d
Jul 28 2019
prev sibling parent Federico Santamorena <federico santamorena.me> writes:
Here, the GitHub file:
https://github.com/yatima1460/Drill/blob/576b4b691ea5357e5271115433eb11d4c0beeae7/Source/Frontend/GTK/Main.d

It's an experimental test using GTK bindings.

Here the flow:

D:
https://github.com/yatima1460/Drill/blob/576b4b691ea5357e5271115433eb11d4c0beeae7/Source/Frontend/GTK/Main.d#L586

extern(C):
https://github.com/yatima1460/Drill/blob/576b4b691ea5357e5271115433eb11d4c0beeae7/Source/Frontend/GTK/Main.d#L323
https://github.com/yatima1460/Drill/blob/576b4b691ea5357e5271115433eb11d4c0beeae7/Source/Frontend/GTK/Main.d#L384

D again:
https://github.com/yatima1460/Drill/blob/576b4b691ea5357e5271115433eb11d4c0beeae7/Source/Frontend/GTK/Main.d#L299

Now userObject inside resultFound is corrupt
Jul 28 2019
prev sibling parent reply Exil <Exil gmall.com> writes:
On Sunday, 28 July 2019 at 18:32:24 UTC, Federico Santamorena 
wrote:
 A stupid example can be passing a D object using the void* 
 user_data GTK argument, passing a extern(C) callback function 
 to GTK, and then inside that callback calling a extern(D) 
 function that accepts a void*.

 I already tried using Variant, but pointers to Variant get 
 corrupted too, so it's useless.
Where is the "D object" allocated? I haven't seen anyone mention it, but the way the GC works, it has to know about the memory it needs to scan to look for references to an object. So if you allocate an object with the GC and then pass it to GTK, the GC isn't going to know about the memory GTK has allocated. The kind of problem you mention can then happen: the GC deallocates the object because it thinks it is no longer in use.
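If that were the cause, the usual precaution is to register the object as a GC root while C code holds the only reference. A minimal sketch, assuming a `Context` struct and a `demo` function invented for illustration, with a malloc'd slot standing in for memory owned by the C library:

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

struct Context { int hits; }

int demo()
{
    // malloc'd memory is not scanned by the D GC, much like memory
    // owned by a C library that stores your user_data pointer.
    auto slot = cast(void**) malloc((void*).sizeof);
    assert(slot !is null);
    scope(exit) free(slot);

    *slot = new Context(41);            // GC-allocated object
    GC.addRoot(*slot);                  // tell the GC it is still referenced
    scope(exit) GC.removeRoot(*slot);   // runs before free(slot)

    GC.collect();                       // the rooted object survives a collection

    auto ctx = cast(Context*) *slot;
    ctx.hits += 1;
    return ctx.hits;
}

void main()
{
    assert(demo() == 42);
}
```

The pairing of `GC.addRoot` and `GC.removeRoot` matters: the root should be removed once the C side no longer uses the pointer, otherwise the object is never collected.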
Jul 28 2019
next sibling parent =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
For completeness, more valuable information is here:
https://dlang.org/spec/interfaceToC.html

Ali
Jul 28 2019
prev sibling parent reply Federico Santamorena <federico santamorena.me> writes:
I even tried GC.disable(). The void* pointer still gets corrupted.
Jul 29 2019
parent reply Jonathan Marler <johnnymarler gmail.com> writes:
Can you print the raw pointer value of context/userObj at different points to see when the corruption occurs?

Print "context" here:
https://github.com/yatima1460/Drill/blob/576b4b691ea5357e5271115433eb11d4c0beeae7/Source/Frontend/GTK/Main.d#L585

Print "context" here:
https://github.com/yatima1460/Drill/blob/576b4b691ea5357e5271115433eb11d4c0beeae7/Source/Frontend/GTK/Main.d#L384

Print "userObject" here:
https://github.com/yatima1460/Drill/blob/576b4b691ea5357e5271115433eb11d4c0beeae7/Source/Backend/Context.d#L187

Print "userObject" here:
https://github.com/yatima1460/Drill/blob/576b4b691ea5357e5271115433eb11d4c0beeae7/Source/Backend/Context.d#L195

Then print "userObject" here:
https://github.com/yatima1460/Drill/blob/576b4b691ea5357e5271115433eb11d4c0beeae7/Source/Frontend/GTK/Main.d#L304

It looks like you're using printf so "%p" should work to print the pointer.
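Something along these lines would do for the tracing. This is only a sketch: the `trace` helper and the `Context` struct are made up for illustration and are not part of the project.

```d
import core.stdc.stdio : printf;

struct Context { int id; }

// Prints a label and the raw pointer value, then hands the pointer back
// so the call can be dropped into an existing expression.
const(void)* trace(const(char)* where, const(void)* p)
{
    printf("%s: %p\n", where, p);
    return p;
}

void main()
{
    auto context = new Context(1);
    trace("activate context", context);

    void* userObject = context;     // what gets handed to the next layer
    trace("startCrawling userObject", userObject);

    // The two printed values should be identical if nothing went wrong.
    assert(userObject is cast(void*) context);
}
```

The first step at which the printed value differs is where the wrong pointer is introduced.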
Jul 29 2019
parent reply Federico Santamorena <federico santamorena.me> writes:
GC disabled:

activate context:0x7ffca2d0c4f8
gtk_search_changed context:0x7ffca2d0c4f8
startCrawling userObject:0x7ffca2d0c4f8
startCrawling foreach loop userObject:0x7ffca2d0c4f8
startCrawling foreach loop userObject:0x7ffca2d0c4f8
startCrawling foreach loop userObject:0x7ffca2d0c4f8
startCrawling foreach loop userObject:0x7ffca2d0c4f8
startCrawling foreach loop userObject:0x7ffca2d0c4f8
startCrawling foreach loop userObject:0x7ffca2d0c4f8
startCrawling foreach loop userObject:0x7ffca2d0c4f8
startCrawling foreach loop userObject:0x7ffca2d0c4f8
startCrawling foreach loop userObject:0x7ffca2d0c4f8
startCrawling foreach loop userObject:0x7ffca2d0c4f8
startCrawling foreach loop userObject:0x7ffca2d0c4f8
startCrawling foreach loop userObject:0x7ffca2d0c4f8
startCrawling foreach loop userObject:0x7ffca2d0c4f8
startCrawling foreach loop userObject:0x7ffca2d0c4f8
startCrawling foreach loop userObject:0x7ffca2d0c4f8
startCrawling foreach loop userObject:0x7ffca2d0c4f8
resultFound result:0x7fc843559c60 userObject:0x7fc8435414d0

(drill-search-gtk:12395): GLib-CRITICAL **: 12:40:07.937: g_async_queue_push: assertion 'queue' failed

GC enabled:

activate context:0x7ffe9369c298
gtk_search_changed context:0x7ffe9369c298
startCrawling userObject:0x7ffe9369c298
startCrawling foreach loop userObject:0x7ffe9369c298
startCrawling foreach loop userObject:0x7ffe9369c298
startCrawling foreach loop userObject:0x7ffe9369c298
startCrawling foreach loop userObject:0x7ffe9369c298
startCrawling foreach loop userObject:0x7ffe9369c298
startCrawling foreach loop userObject:0x7ffe9369c298
startCrawling foreach loop userObject:0x7ffe9369c298
startCrawling foreach loop userObject:0x7ffe9369c298
startCrawling foreach loop userObject:0x7ffe9369c298
startCrawling foreach loop userObject:0x7ffe9369c298
startCrawling foreach loop userObject:0x7ffe9369c298
startCrawling foreach loop userObject:0x7ffe9369c298
startCrawling foreach loop userObject:0x7ffe9369c298
startCrawling foreach loop userObject:0x7ffe9369c298
startCrawling foreach loop userObject:0x7ffe9369c298
startCrawling foreach loop userObject:0x7ffe9369c298
resultFound result:0x7f4f2625bb00 userObject:0x7f4f2628d440

(drill-search-gtk:12223): GLib-CRITICAL **: 12:38:54.729: g_async_queue_push: assertion 'queue' failed

As you can see, the queue pointer inside the context is invalid because the context itself is now invalid, and GLib reports it.

GC on or off does not make any difference, and the corruption happens at resultFound.

I also want to add that I replaced the struct FileInfo with a pointer and it still happens:
https://github.com/yatima1460/Drill/commit/5653c831b03d657a0c8073d5b011934f02ac8b65
Jul 29 2019
parent reply Kagamin <spam here.lot> writes:
You pass &context to Crawler; that's the wrong pointer.
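To illustrate the kind of mistake (a sketch with made-up names `Context` and `asUserData`, not the actual Drill code): any pointer type converts implicitly to void*, so taking the address of a variable that already holds the pointer compiles without complaint but yields a different value.

```d
struct Context { int id; }

// Stand-in for any API that accepts an opaque user pointer.
void* asUserData(void* p) { return p; }

void main()
{
    Context* context = new Context(7);

    void* right = asUserData(context);   // address of the struct
    void* wrong = asUserData(&context);  // Context**: address of the local pointer variable

    assert(right is cast(void*) context);
    assert(wrong !is right);

    // Casting `right` back gives the struct directly.
    assert((cast(Context*) right).id == 7);
    // `wrong` points at the variable, so it needs one extra dereference.
    assert((*cast(Context**) wrong).id == 7);
}
```

Both calls type-check because `Context**` also converts to `void*`, which is why the compiler gives no hint.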
Jul 29 2019
parent reply Federico Santamorena <federico santamorena.me> writes:
On Monday, 29 July 2019 at 11:01:10 UTC, Kagamin wrote:
 You pass &context to Crawler, that's wrong pointer.
You are an absolute madman; it was so obvious yet so hidden.

I wonder how this can be avoided. I didn't use Variant because it does not support shared(), but maybe that was my sin.

What do you think about the idea of the compiler emitting a warning when a pointer to a void* pointer is assigned? I can't think of a good reason to have a void* pointer to a void* pointer, and I have actually never used one in code. Maybe someone has a counterargument to this proposal?
Jul 29 2019
parent reply Kagamin <spam here.lot> writes:
Use delegates for callbacks to prevent this.
Jul 29 2019
parent reply Kagamin <spam here.lot> writes:
struct DrillGtkContext
{
     static extern(C)
     void gtk_search_changed(GtkEditable* widget, ref DrillGtkContext context)
     {
         context.onSearchChanged();
     }
     void onSearchChanged()
     {
         ...
         startCrawling(drillConfig, searchString, &resultFound);
     }
     void resultFound(immutable(FileInfo) result)
     {
         ...
         queue.g_async_queue_push(f);
     }
}

...
DrillContext* startCrawling(
     in const(DrillConfig) config,
     in immutable(string) searchValue,
     scope void delegate(immutable(FileInfo)) resultCallback)
{
     ...
     new Crawler(..., c.callback);
}
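
A self-contained toy version of the same idea, in case it helps to see it run. Everything here is an assumption for illustration: `Ctx`, `crawl` and `CCallback` are invented names, `int` stands in for FileInfo, and the C library is simulated by calling the trampoline through a plain C function pointer type.

```d
struct Ctx
{
    int[] results;

    // extern(C) trampoline: `ref` is passed as a pointer at the ABI level,
    // so C code can call this with the user_data pointer.
    static extern(C) void onChanged(void* widget, ref Ctx self)
    {
        self.search();
    }

    void search()
    {
        // &found is a delegate: it carries `this` along with the function,
        // so no separate void* userObject has to be threaded through.
        crawl(&found);
    }

    void found(int result)
    {
        results ~= result;
    }
}

// Stand-in for startCrawling: reports three fake results.
void crawl(scope void delegate(int) resultCallback)
{
    foreach (i; 0 .. 3)
        resultCallback(i);
}

// How a C library would see the callback.
alias CCallback = extern(C) void function(void* widget, void* userData);

void main()
{
    Ctx ctx;
    auto cb = cast(CCallback) &Ctx.onChanged;
    cb(null, &ctx);                 // simulate the C side firing the signal
    assert(ctx.results == [0, 1, 2]);
}
```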
Jul 29 2019
next sibling parent reply Federico Santamorena <federico santamorena.me> writes:
Oooooh, I see. I recently switched from C to D, and this is a lot cleaner. I thought you couldn't use D things like "ref" in extern(C).
Jul 29 2019
parent reply Kagamin <spam here.lot> writes:
On Monday, 29 July 2019 at 12:42:37 UTC, Federico Santamorena 
wrote:
 I thought you couldn't use D things like "ref" in extern(C)
It's not recommended for proper C bindings: interoperability can be tricky, so bindings are kept close to the original C source. Also, to make porting C code and examples easy, invocations should stay as close to the C code as possible. I would say your use of UFCS with gtk functions is probably not a good idea; it's difficult to recognize them as gtk functions this way. I had a suspicion something smart was going on there.
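For reference, a small sketch of the two call styles. `Widget` and `widget_click` are hypothetical and defined in D here only so the example runs; with a real binding the function would just be declared.

```d
struct Widget { int clicks; }

// Hypothetical C-style free function taking the object as first argument.
extern(C) int widget_click(Widget* w)
{
    return ++w.clicks;
}

void main()
{
    Widget w;
    auto p = &w;

    // C-like invocation: reads the same as the C documentation and examples.
    assert(widget_click(p) == 1);

    // UFCS invocation: the same function, but it now looks like a method of p.
    assert(p.widget_click() == 2);
}
```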
Jul 29 2019
parent reply Federico Santamorena <federico santamorena.me> writes:
I see.

I still want to push the idea that a void* pointer to a void* pointer should trigger a warning from the compiler.

Or, even better, a new flag that emits warnings for fishy void* pointer manipulation.

Ideas about this?
Jul 29 2019
parent Dennis <dkorpel gmail.com> writes:
On Monday, 29 July 2019 at 15:26:34 UTC, Federico Santamorena 
wrote:
 I still actually want to push the idea that a void* pointer to 
 a void* pointer should be a warning emitted by the compiler.

 Or even better a new flag emitting warnings for fishy void* 
 pointers manipulation.

 Ideas about this?
Quick thoughts:
- there are legitimate use cases for void**
- warnings are generally bad. They are either treated as pedantic non-standard errors, or they pile up until no one looks at them anymore.
- it's going to be annoying in generic code: templates need special checks that &T is not done if T may be a void*.
- compilers already have like 100 flags
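A sketch of the generic-code point, using a made-up `swapValues` template: once T is instantiated as void*, perfectly ordinary code ends up taking a void**.

```d
// Generic swap through pointers; nothing here is specific to void*.
void swapValues(T)(T* a, T* b)
{
    T tmp = *a;
    *a = *b;
    *b = tmp;
}

void main()
{
    int x = 1, y = 2;
    void* p = &x;
    void* q = &y;

    // T is deduced as void*, so &p and &q are void** here.
    swapValues(&p, &q);

    assert(p is cast(void*) &y);
    assert(q is cast(void*) &x);
}
```

A warning on every void** would fire on instantiations like this one unless the template special-cased void*.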
Jul 29 2019
prev sibling parent Federico Santamorena <federico santamorena.me> writes:
Thanks to all in this thread. Recently, all the problems I have been having with D come from the fact that D is actually superior to C, and some assumptions I still carry over from C should be uninstalled from my brain.

Thanks.
Jul 29 2019