digitalmars.D - DLL crash inside removethreadtableentry - where's the source code
- Ben Davis (178/178) Feb 16 2013 Hi,
- Ben Davis (8/26) Feb 16 2013 No, I got confused here - the shift right is equivalent to division by
- Rainer Schuetze (5/30) Feb 16 2013 _removethreadtableentry is a function in the DM C runtime library. It
- Ben Davis (10/14) Feb 17 2013 That's a good start :)
- Rainer Schuetze (8/22) Feb 17 2013 Sure it can be fixed. It's up to Walter to build a new lib for
- Trey Brisbane (4/22) May 11 2013 Sorry to necro this thread, but I'm currently experiencing the
- Walter Bright (4/26) May 11 2013 I thought this was already fixed. What's the date/size on your snn.lib? ...
- Trey Brisbane (9/12) May 11 2013 In dmd.2.062.zip (the one I'm using):
- Trey Brisbane (2/2) May 11 2013 Yep, problem solved.
Hi,

The user-mode driver I'm working on (a 32-bit DLL) is crashing Windows Media Player on exit. (Two other host apps exit fine.) I can catch it in the Visual Studio debugger, but only see assembly language. Initially I'm just after tips on where to find source for the bits of D that are involved, but maybe someone will recognise the problem already...

I've gone through the assembly in some detail, and established that the crash is inside some removethreadtableentry() code which is called shortly before DllMain(DLL_THREAD_DETACH), and must look something like:

    //tid is the Windows numeric thread ID for the current thread
    removethreadtableentry(tid) {
        foreach (i, obj in someObjArray1024EntriesLong) {
            if (obj.someField == tid) goto foundIt;
        }
        return;

        //When we get here, i is 1 (pretend it's in scope)
    foundIt:
        free(obj.something); //Does nothing, already 0
        if (obj.somethingElse) { //Does nothing, already 0
            CloseHandle(obj.somethingElse);
        }
        free(obj); //Crash inside this free()
    }

Furthermore, I've established that:

- removethreadtableentry() doesn't get to foundIt for most threads.
- (almost certain) removethreadtableentry() isn't called at all for one of the two host apps that work fine; and is called but doesn't get to foundIt for the other app.
- (almost certain) removethreadtableentry() crashes the first time it gets to foundIt.

(These are almost certain in the sense that I only set the breakpoint after catching the first on-shutdown DLL_THREAD_DETACH, which means I may have missed one; but it's unlikely.)

So basically this seems to point to some buggy code that hardly ever runs, but does in my case. (Or it's designed for a slightly different use of DLLs or something like that.)

For reference, the assembly language I analysed is below, but I think the next step is if someone either wants to fix removethreadtableentry(), or direct me to the source so I can investigate further. (It is a D function, is it? It looks like D naming as opposed to Microsoft naming.)

I'm off to bed, but will pick this up again tomorrow.

Full detail follows (but probably isn't worth reading).

The call stack looks like this:

    myproject.dll!RTLMultiPool::SelectFree() + 0x17 bytes C++
    myproject.dll!__removethreadtableentry() + 0x69 bytes C++
    myproject.dll!__DllMainCRTStartup@12() + 0x10c bytes C++
    ntdll.dll!_LdrpCallInitRoutine@16() + 0x14 bytes
    ntdll.dll!_LdrShutdownThread@0() + 0xe2 bytes
    ntdll.dll!_RtlExitUserThread@4() + 0x2a bytes
    kernel32.dll!@BaseThreadInitThunk@12() + 0x19 bytes
    ntdll.dll!___RtlUserThreadStart@8() + 0x27 bytes
    ntdll.dll!__RtlUserThreadStart@8() + 0x1b bytes

When I view the assembly for __DllMainCRTStartup, I can see that this is the function directly responsible for calling my DllMain function. There seems to be only one place where it calls removethreadtableentry, and it seems to be before a call to DllMain.

When I look at the assembly for removethreadtableentry, it's trying to make the last call to 'free' before returning, as follows:

    __removethreadtableentry:
    05A88F64 push eax
    05A88F65 mov ecx,dword ptr [esp+8]
    05A88F69 xor edx,edx
    05A88F6B push ebx
    05A88F6C push esi
    05A88F6D jmp __removethreadtableentry+0Fh (5A88F73h)
    05A88F6F pop esi
    05A88F70 pop ebx
    05A88F71 pop eax
    05A88F72 ret
    05A88F73 mov eax,dword ptr [___thdtbl (5AADFBCh)]
    05A88F78 mov ebx,dword ptr [eax+edx*4]
    05A88F7B test ebx,ebx
    05A88F7D je __removethreadtableentry+20h (5A88F84h)
    05A88F7F cmp dword ptr [ebx+18h],ecx
    05A88F82 je __removethreadtableentry+2Bh (5A88F8Fh)
    05A88F84 inc edx
    05A88F85 cmp edx,400h
    05A88F8B je __removethreadtableentry+0Bh (5A88F6Fh)
    05A88F8D jmp __removethreadtableentry+0Fh (5A88F73h)
    05A88F8F mov dword ptr [esp+8],edx *
    05A88F93 mov ecx,dword ptr [esp+8]
    05A88F97 mov edx,dword ptr [___thdtbl (5AADFBCh)]
    05A88F9D mov esi,dword ptr [___thdtbl (5AADFBCh)]
    05A88FA3 mov ebx,dword ptr [edx+ecx*4]
    05A88FA6 mov dword ptr [esi+ecx*4],0
    05A88FAD push dword ptr [ebx+20h]
    05A88FB0 call _free (5A87118h)
    05A88FB5 add esp,4
    05A88FB8 cmp dword ptr [ebx+1Ch],0
    05A88FBC je __removethreadtableentry+63h (5A88FC7h)
    05A88FBE push dword ptr [ebx+1Ch]
    05A88FC1 call dword ptr [__imp__CloseHandle@4 (5A42B28h)]
    05A88FC7 push ebx
    05A88FC8 call _free (5A87118h) <--------------------
    05A88FCD add esp,4
    05A88FD0 pop esi
    05A88FD1 pop ebx
    05A88FD2 pop eax
    05A88FD3 ret

The crash is then somewhere deep inside free().

Further debugging shows that removethreadtableentry is searching through a 1024-entry array of pointers, looking for a non-null pointer to an object for which the field at offset 0x18 is the current thread ID (which is in ecx). If it finds it, then it jumps to the point where I put the *.

The crash seems to happen the very first time this line is hit (at least since I put the breakpoint there, which was after the first call into my DllMain).

So in summary: a number of threads (7 to 10) get successfully detached first, but weren't in the table that removethreadtableentry is searching. For the first thread to be found in that table, it crashed.

Finally, here's everything from the * to the call to free() (on a different run, so different addresses), with some values annotated:

    //edx is 1, so it's the second entry in the table.
    05C08F8F mov dword ptr [esp+8],edx
    05C08F93 mov ecx,dword ptr [esp+8]
    //These set edx and esi to 0x05c2cd40.
    05C08F97 mov edx,dword ptr [___thdtbl (5C2DFBCh)]
    05C08F9D mov esi,dword ptr [___thdtbl (5C2DFBCh)]
    //ecx is 1, and ebx becomes 0x05c29b9b.
    05C08FA3 mov ebx,dword ptr [edx+ecx*4]
    05C08FA6 mov dword ptr [esi+ecx*4],0
    //This pushes 0, and the call to free() does nothing.
    05C08FAD push dword ptr [ebx+20h]
    05C08FB0 call _free (5C07118h)
    05C08FB5 add esp,4
    //This is 0 and the CloseHandle call is skipped.
    05C08FB8 cmp dword ptr [ebx+1Ch],0
    05C08FBC je __removethreadtableentry+63h (5C08FC7h)
    05C08FBE push dword ptr [ebx+1Ch]
    05C08FC1 call dword ptr [__imp__CloseHandle@4 (5BC2B28h)]
    //ebx is unchanged from above, and this call crashes.
    05C08FC7 push ebx
    05C08FC8 call _free (5C07118h)

I also stepped inside free(), and the next interesting stuff happens here (note I skipped free() itself and went straight to RTLMultiPool):

    RTLMultiPool::Free:
    05C0AC68 push ecx
    05C0AC69 cmp dword ptr [esp+8],0
    05C0AC6E je RTLMultiPool::Free+15h (5C0AC7Dh)
    05C0AC70 mov eax,dword ptr [esp+8]
    //eax is now 0x05c29b9b, the pointer we're trying to free
    05C0AC74 lea edx,[eax-4]
    //edx is now eax-4 = 0x05c29b97
    05C0AC77 push edx
    05C0AC78 call RTLMultiPool::SelectFree (5C0AC34h)
    ...

    RTLMultiPool::SelectFree:
    05C0AC34 push ecx
    //This reads 0x05c29b97 into eax
    05C0AC35 mov eax,dword ptr [esp+8]
    //This reads an address from where eax points, and edx is 0
    05C0AC39 mov edx,dword ptr [eax]
    05C0AC3B push ebx
    05C0AC3C push esi
    //Looking at ecx+4 revealed the value 0x00000080 (128)
    05C0AC3D cmp edx,dword ptr [ecx+4]
    05C0AC40 ja RTLMultiPool::SelectFree+21h (5C0AC55h)
    //So we get here
    05C0AC42 lea ebx,[edx-1] //ebx = 0xffffffff
    05C0AC45 shr ebx,3 //ebx = 0x1fffffff
    05C0AC48 push eax
    05C0AC49 mov esi,dword ptr [ecx] //esi = 0x0516000c
    05C0AC4B mov ecx,dword ptr [esi+ebx*4] //crash!

I suppose esi + 0x1fffffff*4 is basically esi-4. But then we get:

    Unhandled exception at 0x05c0ac4b (myproject.dll) in wmplayer.exe: 0xC0000005: Access violation reading location 0x85160008.

    //Here's the rest of SelectFree FWIW.
    05C0AC4E call RTLPool::Free (5C0D460h)
    05C0AC53 jmp RTLMultiPool::SelectFree+2Dh (5C0AC61h)
    05C0AC55 mov ecx,dword ptr [RTLHeap::pMainHeap (5C2B4FCh)]
    05C0AC5B push eax
    05C0AC5C call RTLHeap::Free (5C0D6B4h)
    05C0AC61 pop esi
    05C0AC62 pop ebx
    05C0AC63 pop eax
    05C0AC64 ret 4
    05C0AC67 int 3
Feb 16 2013
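For readers following the disassembly, the table scan at the top of __removethreadtableentry can be restated in C. This is only a sketch reconstructed from the listing in the post, not the library's source: the struct name, field names and function name are hypothetical, and only the slot count (400h) and the offsets 18h, 1Ch and 20h come from the disassembly. The pointer-sized fields are declared as uint32_t so that the 32-bit offsets hold on any host.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, NULL */
#include <stdint.h>

#define THREAD_TABLE_SLOTS 0x400   /* from: cmp edx,400h */

/* Hypothetical layout; only the three commented offsets are from the post. */
struct thread_record {
    uint8_t  unknown[0x18];   /* fields not identified in the thread */
    uint32_t thread_id;       /* [ebx+18h], compared against the thread ID */
    uint32_t handle;          /* [ebx+1Ch], passed to CloseHandle if nonzero */
    uint32_t buffer;          /* [ebx+20h], passed to free() */
};

static_assert(offsetof(struct thread_record, thread_id) == 0x18, "thread_id offset");
static_assert(offsetof(struct thread_record, handle) == 0x1C, "handle offset");
static_assert(offsetof(struct thread_record, buffer) == 0x20, "buffer offset");

/* Mirrors the loop at 05A88F73..05A88F8D: skip null slots, compare the
 * field at offset 18h, give up after 0x400 slots.
 * Returns the matching slot index, or -1 if the thread is not in the table. */
int find_thread_slot(struct thread_record *const table[], uint32_t thread_id)
{
    for (int i = 0; i < THREAD_TABLE_SLOTS; i++) {
        if (table[i] != NULL && table[i]->thread_id == thread_id)
            return i;
    }
    return -1;
}
```

In the crashing run described above, the scan would have returned 1, after which the routine clears the slot and frees the record.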
Correction to my hideous analysis inside free :P

On 17/02/2013 03:07, Ben Davis wrote:
> RTLMultiPool::SelectFree:
> 05C0AC34 push ecx
> //This reads 0x05c29b97 into eax
> 05C0AC35 mov eax,dword ptr [esp+8]
> //This reads an address from where eax points, and edx is 0
> 05C0AC39 mov edx,dword ptr [eax]
> 05C0AC3B push ebx
> 05C0AC3C push esi
> //Looking at ecx+4 revealed the value 0x00000080 (128)
> 05C0AC3D cmp edx,dword ptr [ecx+4]
> 05C0AC40 ja RTLMultiPool::SelectFree+21h (5C0AC55h)
> //So we get here
> 05C0AC42 lea ebx,[edx-1] //ebx = 0xffffffff
> 05C0AC45 shr ebx,3 //ebx = 0x1fffffff
> 05C0AC48 push eax
> 05C0AC49 mov esi,dword ptr [ecx] //esi = 0x0516000c
> 05C0AC4B mov ecx,dword ptr [esi+ebx*4] //crash!
>
> I suppose esi + 0x1fffffff*4 is basically esi-4. But then we get:

No, I got confused here - the shift right is equivalent to division by 8, not by 4. So the address [esi + 0x1fffffff*4] is very likely to be very wrong.

This implies that edx being 0 is bad. I'd be inclined to guess at maybe a double freeing, or maybe freeing an address that isn't even a heap address. It's also very interesting that the address we're trying to free is completely unaligned (an odd number).
Feb 16 2013
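The corrected arithmetic can be checked mechanically. The sketch below replays the three instructions that lead to the fault using 32-bit unsigned arithmetic; the function names are made up for illustration, and the input values are the ones quoted in the posts (size word 0, pool limit 128, esi = 0x0516000c, pointer 0x05c29b9b).

```c
#include <assert.h>
#include <stdint.h>

/* cmp edx,[ecx+4] / ja: sizes above the limit go to RTLHeap::Free,
 * everything else takes the pool path that crashed. */
int takes_pool_path(uint32_t block_size, uint32_t pool_limit)
{
    return block_size <= pool_limit;
}

/* lea ebx,[edx-1] / shr ebx,3: for a size of 0 the subtraction wraps. */
uint32_t pool_index(uint32_t block_size)
{
    return (block_size - 1u) >> 3;
}

/* Address read by: mov ecx,dword ptr [esi+ebx*4] (wraps at 32 bits). */
uint32_t pool_slot_address(uint32_t pool_base, uint32_t block_size)
{
    return pool_base + pool_index(block_size) * 4u;
}

int is_dword_aligned(uint32_t address)
{
    return (address & 3u) == 0;
}

/* Re-checks the values reported in the thread. */
void check_reported_values(void)
{
    assert(takes_pool_path(0, 128));
    assert(pool_index(0) == 0x1FFFFFFFu);
    assert(pool_slot_address(0x0516000Cu, 0) == 0x85160008u);
    assert(!is_dword_aligned(0x05C29B9Bu));
}
```

The computed address, 0x85160008, is exactly the location named in the access violation, which supports the reading that a size word of 0 in front of the block is what sends SelectFree off the rails.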
On 17.02.2013 04:07, Ben Davis wrote:
> Hi,
>
> The user-mode driver I'm working on (a 32-bit DLL) is crashing Windows Media Player on exit. (Two other host apps exit fine.) I can catch it in the Visual Studio debugger, but only see assembly language. Initially I'm just after tips on where to find source for the bits of D that are involved, but maybe someone will recognise the problem already...
>
> I've gone through the assembly in some detail, and established that the crash is inside some removethreadtableentry() code which is called shortly before DllMain(DLL_THREAD_DETACH), and must look something like:
>
>     //tid is the Windows numeric thread ID for the current thread
>     removethreadtableentry(tid) {
>         foreach (i, obj in someObjArray1024EntriesLong) {
>             if (obj.someField == tid) goto foundIt;
>         }
>         return;
>
>         //When we get here, i is 1 (pretend it's in scope)
>     foundIt:
>         free(obj.something); //Does nothing, already 0
>         if (obj.somethingElse) { //Does nothing, already 0
>             CloseHandle(obj.somethingElse);
>         }
>         free(obj); //Crash inside this free()
>     }
>
> Furthermore, I've established that:
>
> - removethreadtableentry() doesn't get to foundIt for most threads.

_removethreadtableentry is a function in the DM C runtime library. It has the bug that it tries to free a data record that has never been allocated if the thread that loaded the DLL is terminated. This is the entry at index 1.
Feb 16 2013
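The class of bug described here, calling free() on a record that never came from the allocator, is commonly avoided by tracking ownership next to the pointer. The sketch below illustrates that idea only. It is not the Digital Mars fix, whose source is not shown in this thread; the table, the heap_owned flag and all function names are hypothetical, and the table is shrunk to 8 slots for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SLOTS 8   /* illustration only; the real table has 1024 slots */

struct record {
    unsigned thread_id;
    bool     heap_owned;   /* hypothetical flag: true only for malloc'd records */
};

static struct record *record_table[SLOTS];

/* Registers storage the table does not own; it must never reach free(). */
void add_borrowed_record(int slot, struct record *storage, unsigned thread_id)
{
    assert(slot >= 0 && slot < SLOTS && storage != NULL);
    storage->thread_id = thread_id;
    storage->heap_owned = false;
    record_table[slot] = storage;
}

/* Allocates a record with malloc; returns false if allocation fails. */
bool add_heap_record(int slot, unsigned thread_id)
{
    assert(slot >= 0 && slot < SLOTS);
    struct record *r = malloc(sizeof *r);
    if (r == NULL)
        return false;
    r->thread_id = thread_id;
    r->heap_owned = true;
    record_table[slot] = r;
    return true;
}

/* Clears the slot for thread_id and frees the record only when the table
 * owns it. Returns true if a record was found. */
bool remove_record(unsigned thread_id)
{
    for (int i = 0; i < SLOTS; i++) {
        struct record *r = record_table[i];
        if (r == NULL || r->thread_id != thread_id)
            continue;
        record_table[i] = NULL;
        if (r->heap_owned)
            free(r);
        return true;
    }
    return false;
}
```

With a guard like this, removing the entry at index 1 would clear the slot without handing a non-heap pointer to free().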
On 17/02/2013 07:56, Rainer Schuetze wrote:
> _removethreadtableentry is a function in the DM C runtime library. It has the bug that it tries to free a data record that has never been allocated if the thread that loaded the DLL is terminated. This is the entry at index 1.

That's a good start :) Can it be fixed? Who would be able to do it?

Or is there some code I can put in my project that will successfully work around the issue?

I get the impression the source is available for money. I found this page http://www.digitalmars.com/download/freecompiler.html which mentions complete library source under a link to the shop. I *could* buy it and see if I can fix it myself, but it seems a bit risky.

By the way, thanks for Visual D :)
Feb 17 2013
On 17.02.2013 12:31, Ben Davis wrote:
> On 17/02/2013 07:56, Rainer Schuetze wrote:
>> _removethreadtableentry is a function in the DM C runtime library. It has the bug that it tries to free a data record that has never been allocated if the thread that loaded the DLL is terminated. This is the entry at index 1.
>
> That's a good start :) Can it be fixed? Who would be able to do it?

Sure it can be fixed. It's up to Walter to build a new lib for distribution, though.

> Or is there some code I can put in my project that will successfully work around the issue?

Without recompiling the lib, I guess the best that can be done is patch snn.lib to not execute the last call to free().

> I get the impression the source is available for money. I found this page http://www.digitalmars.com/download/freecompiler.html which mentions complete library source under a link to the shop. I *could* buy it and see if I can fix it myself, but it seems a bit risky.

Yes, you get library source and a lot more. The risk is pretty limited, it is not very expensive.

> By the way, thanks for Visual D :)

Thanks :-)
Feb 17 2013
On Sunday, 17 February 2013 at 11:32:02 UTC, Ben Davis wrote:
> On 17/02/2013 07:56, Rainer Schuetze wrote:
>> _removethreadtableentry is a function in the DM C runtime library. It has the bug that it tries to free a data record that has never been allocated if the thread that loaded the DLL is terminated. This is the entry at index 1.
>
> That's a good start :) Can it be fixed? Who would be able to do it?
>
> Or is there some code I can put in my project that will successfully work around the issue?
>
> I get the impression the source is available for money. I found this page http://www.digitalmars.com/download/freecompiler.html which mentions complete library source under a link to the shop. I *could* buy it and see if I can fix it myself, but it seems a bit risky.
>
> By the way, thanks for Visual D :)

Sorry to necro this thread, but I'm currently experiencing the exact same issue. Was this ever fixed? If not, was there a bug filed?
May 11 2013
On 5/11/2013 12:10 AM, Trey Brisbane wrote:
> On Sunday, 17 February 2013 at 11:32:02 UTC, Ben Davis wrote:
>> On 17/02/2013 07:56, Rainer Schuetze wrote:
>>> _removethreadtableentry is a function in the DM C runtime library. It has the bug that it tries to free a data record that has never been allocated if the thread that loaded the DLL is terminated. This is the entry at index 1.
>>
>> That's a good start :) Can it be fixed? Who would be able to do it?
>>
>> Or is there some code I can put in my project that will successfully work around the issue?
>>
>> I get the impression the source is available for money. I found this page http://www.digitalmars.com/download/freecompiler.html which mentions complete library source under a link to the shop. I *could* buy it and see if I can fix it myself, but it seems a bit risky.
>>
>> By the way, thanks for Visual D :)
>
> Sorry to necro this thread, but I'm currently experiencing the exact same issue. Was this ever fixed? If not, was there a bug filed?

I thought this was already fixed. What's the date/size on your snn.lib? The latest is:

    02/25/2013 06:19 PM 573,952 snn.lib
May 11 2013
On Saturday, 11 May 2013 at 07:38:53 UTC, Walter Bright wrote:
> I thought this was already fixed. What's the date/size on your snn.lib? The latest is:
>
>     02/25/2013 06:19 PM 573,952 snn.lib

In dmd.2.062.zip (the one I'm using):
    574,464 2012-12-11 7:30 AM

In dmc.zip:
    573,952 2013-02-26 11:19 AM <-- the one I should be using?

In dmc856.zip (from the Digital Mars site):
    574,464 2012-12-11 7:30 AM

Shouldn't these be in sync? :P

Anyway, thanks for the tip. I'll give it a shot and post back.
May 11 2013
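The size comparison made in this exchange can be scripted. The sketch below uses only standard C file I/O; the two byte counts are the ones quoted in the posts above, while the function names and the idea of classifying a library by size alone are assumptions made for illustration. Size is a weak fingerprint, so treat a match as a hint rather than proof of which build is installed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Sizes quoted in the thread. */
#define SNN_SIZE_2012_12 574464L   /* dmd.2.062.zip and dmc856.zip */
#define SNN_SIZE_2013_02 573952L   /* dmc.zip, reported to solve the crash */

/* Returns the file size in bytes, or -1 if the file cannot be read. */
long file_size(const char *path)
{
    assert(path != NULL);
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    long size = -1;
    if (fseek(f, 0, SEEK_END) == 0)
        size = ftell(f);
    fclose(f);
    return size;
}

/* Maps a size to the build it matches in the thread, if any. */
const char *describe_snn_size(long size)
{
    if (size == SNN_SIZE_2013_02)
        return "matches the 2013-02 snn.lib";
    if (size == SNN_SIZE_2012_12)
        return "matches the 2012-12 snn.lib";
    return "unknown snn.lib size";
}
```

A caller would pass the path of the snn.lib that the linker actually picks up, for example `describe_snn_size(file_size(path))`, where `path` is wherever the local installation keeps the library.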
Yep, problem solved. Thanks very much for your help! :)
May 11 2013