From: Olegas on 3 May 2010 15:44

Hello there,

I'm looking for some suggestions in regard to the issue below. If this is not the appropriate group, please redirect me.

I'm trying to track down the root cause of a fairly elusive problem that occurs once in a while and only when full page heap is enabled. The problem is not reproducible on demand when we enable full page heap, but the symptoms suggest heap corruption. I am not sure whether this is some type of timing or synchronization issue.

Basically, we get crash dumps from our customers that show:

# 0 Id: 1f0.810 Suspend: 1 Teb: 7ffdf000 Unfrozen
ChildEBP RetAddr Args to Child
0012fa64 7c90df5a 7c91b24b 00000064 00000000 ntdll!KiFastSystemCallRet
0012fa68 7c91b24b 00000064 00000000 00000000 ntdll!ZwWaitForSingleObject+0xc
0012faf0 7c901046 0197e178 7c912cce 7c97e178 ntdll!RtlpWaitForCriticalSection+0x132
0012faf8 7c912cce 7c97e178 00000000 006ff238 ntdll!RtlEnterCriticalSection+0x46
0012fb34 7c80b4af 00000001 00000000 0012fb70 ntdll!LdrLockLoaderLock+0xea
0012fb8c 7c80b5ad 00400000 02c5cdf8 00000104 kernel32!GetModuleFileNameW+0x89
0012fbb8 73e69cc8 00400000 0012fcdc 00000104 kernel32!GetModuleFileNameA+0x4b
0012fee8 73e69c67 006ff238 73ddcf57 00400000 mfc42!CWinApp::SetCurrentHandles+0x45
0012fef0 73ddcf57 00400000 00000000 00152ffc mfc42!AfxWinInit+0x4f
0012ff10 0065f893 00400000 00000000 00152ffc mfc42!AfxWinMain+0x2c
0012ff24 0056f6f2 00400000 00000000 00152ffc MyApp!WinMain+0x15 [appmodul.cpp @ 30]
0012ffc0 7c817077 80000001 0145e0c4 7ffdd000 MyApp!WinMainCRTStartup+0x134
0012fff0 00000000 0056f5be 00000000 78746341 kernel32!BaseProcessStart+0x23

1 Id: 1f0.8c8 Suspend: 1 Teb: 7ffde000 Unfrozen
ChildEBP RetAddr Args to Child
02a0f650 7c90df4a 7c8648a2 00000002 02a0f838 ntdll!KiFastSystemCallRet
02a0f654 7c8648a2 00000002 02a0f838 00000001 ntdll!ZwWaitForMultipleObjects+0xc
02a0fa08 77c32f0f 02a0fa50 00000000 00000000 kernel32!UnhandledExceptionFilter+0x8b9
02a0fa24 77c3a3c7 00000000 02a0fa50 77c35cf5 msvcrt!_XcptFilter+0x161
02a0fa30 77c35cf5 02a0fa58 00000000 02a0fa58 msvcrt!_endthreadex+0xc0
02a0fa58 7c9032a8 02a0fb44 02a0ffa4 02a0fb60 msvcrt!_except_handler3+0x61
02a0fa7c 7c90327a 02a0fb44 02a0ffa4 02a0fb60 ntdll!ExecuteHandler2+0x26
02a0fb2c 7c90e48a 00000000 02a0fb60 02a0fb44 ntdll!ExecuteHandler+0x24
02a0fb2c 77c47631 00000000 02a0fb60 02a0fb44 ntdll!KiUserExceptionDispatcher+0xe
02a0fe2c 73e68a0e 02c72fe0 00000000 00000024 msvcrt!memset+0x41
02a0fe48 73e682ca 00000001 02c6eee8 00000000 mfc42!CThreadSlotData::SetValue+0xb9
02a0fe5c 73e6842a 73e688db 00000003 02171891 mfc42!CThreadLocalObject::GetData+0x55
02a0fe68 02171891 021752e8 00000003 80284006 mfc42!AFX_MAINTAIN_STATE2::AFX_MAINTAIN_STATE2+0x14
02a0fe80 02171a7b 02170000 00000003 00000000 MyDll!DllMain+0x102 [dllmodul.cpp @ 169]
02a0fea0 7c90118a 02170000 00000003 00000000 MyDll!_DllMainCRTStartup+0x50
02a0fec0 7c913a43 02171a2b 02170000 00000003 ntdll!LdrpCallInitRoutine+0x14
02a0ff38 7c80c136 7c968f03 02786f78 026fff78 ntdll!LdrShutdownThread+0xd7
02a0ff70 77c3a33b 00000000 02786f78 02a0ffb4 kernel32!ExitThread+0x3e
02a0ff80 77c3a3b5 00000000 7c968f03 02695000 msvcrt!_endthreadex+0x34
02a0ffb4 7c80b729 026fff78 7c968f03 02695000 msvcrt!_endthreadex+0xae

2 Id: 1f0.8e8 Suspend: 1 Teb: 7ffdc000 Unfrozen
ChildEBP RetAddr Args to Child
0300fc0c 7c90df5a 7c91b24b 00000064 00000000 ntdll!KiFastSystemCallRet
0300fc10 7c91b24b 00000064 00000000 00000000 ntdll!ZwWaitForSingleObject+0xc
0300fc98 7c901046 0197e178 7c91e3b5 7c97e178 ntdll!RtlpWaitForCriticalSection+0x132
0300fca0 7c91e3b5 7c97e178 0300fd2c 00000004 ntdll!RtlEnterCriticalSection+0x46
0300fd18 7c90e457 0300fd2c 7c900000 00000000 ntdll!_LdrpInitialize+0xf0
00000000 00000000 00000000 00000000 00000000 ntdll!KiUserApcDispatcher+0x7

All 3 threads were created about 30 seconds ago, but have consumed hardly any CPU time. It also seems as if the application is shutting down.
0:001> !runaway 7
 User Mode Time
  Thread    Time
   2:8e8    0 days 0:00:00.000
   1:8c8    0 days 0:00:00.000
   0:810    0 days 0:00:00.000
 Kernel Mode Time
  Thread    Time
   0:810    0 days 0:00:00.140
   1:8c8    0 days 0:00:00.031
   2:8e8    0 days 0:00:00.000
 Elapsed Time
  Thread    Time
   0:810    0 days 0:00:29.746
   1:8c8    0 days 0:00:29.590
   2:8e8    0 days 0:00:29.511

The NT loader lock is owned by thread 1. Threads 0 and 2 are blocked on the NT loader lock's critical section.

0:001> !cs -l
-----------------------------------------
DebugInfo = 0x7c97e1a0
Critical section = 0x7c97e178 (ntdll!LdrpLoaderLock+0x0)
LOCKED
LockCount = 0x2
OwningThread = 0x000008c8
RecursionCount = 0x1
LockSemaphore = 0x64
SpinCount = 0x00000000

So, an access violation takes place within thread 1's context. The access violation occurs while memset is zeroing 36 bytes (0x24) starting at address 02c72fe0. However, the block at that address was allocated with only 32 bytes (UserSize 0x20), 4 bytes fewer than memset is asked to clear, which is why we get the access violation.

0:001> !heap -p -a 02c72fe0
 address 02c72fe0 found in
 _DPH_HEAP_ROOT @ 141000
 in busy allocation ( DPH_HEAP_BLOCK: UserAddr UserSize - VirtAddr VirtSize)
                              142b24:  2c72fe0       20 -  2c72000     2000
 7c918f21 ntdll!RtlAllocateHeap+0x00000e64
 7c809a7f kernel32!LocalAlloc+0x00000058
 73e689d7 mfc42!CThreadSlotData::SetValue+0x00000082
 02171a7b MyDll!_DllMainCRTStartup+0x00000050
 7c90118a ntdll!LdrpCallInitRoutine+0x00000014
 7c913a43 ntdll!LdrShutdownThread+0x000000d7
 7c80c136 kernel32!ExitThread+0x0000003e
 77c3a33b msvcrt!_endthreadex+0x00000034
 77c3a3b6 msvcrt!_endthreadex+0x000000af
 7c80b729 kernel32!BaseThreadStart+0x00000037

The disassembly below matches the code snippet from AFXTLS.cpp.
73e689f0 e8c0d2fbff  call  mfc42!AfxThrowMemoryException (73e25cb5)
73e689f5 8b4608      mov   eax,dword ptr [esi+8]
73e689f8 8b4f0c      mov   ecx,dword ptr [edi+0Ch]
73e689fb 2bc8        sub   ecx,eax
73e689fd c1e102      shl   ecx,2
73e68a00 51          push  ecx
73e68a01 8b4e0c      mov   ecx,dword ptr [esi+0Ch]
73e68a04 8d0481      lea   eax,[ecx+eax*4]
73e68a07 53          push  ebx
73e68a08 50          push  eax
73e68a09 e8748ff6ff  call  mfc42!memset (73dd1982)

void CThreadSlotData::SetValue(int nSlot, void* pValue)
<snip>
    if (pData->pData == NULL)
        AfxThrowMemoryException();

    // initialize the newly allocated part
    memset(pData->pData + pData->nCount, 0,
        (m_nMax - pData->nCount) * sizeof(LPVOID));

So, either (pData->pData + pData->nCount) points to a wrong memory block or (m_nMax - pData->nCount) comes up with wrong byte count.

Does anyone have any suggestions on how to track down this issue?

Thank you,
Olegas
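The size mismatch in the post above can be checked by hand. What follows is only a sketch under stated assumptions, not MFC's code: it assumes a 32-bit process with 4-byte pointers (as the addresses in the dump suggest), and the names `SlotArray` and `grow_snapshot` are hypothetical. The block is 0x20 bytes (8 slots), while memset is asked to clear 0x24 bytes (9 slots) starting at the block's first byte. One possible way, though not necessarily the only one, for such a mismatch to arise is that a shared maximum is read once to size the allocation and again to size the fill, and changes in between. The sketch sizes both from a single snapshot so they cannot disagree.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pointer size in the 32-bit process from the dump (assumption: x86 build).
constexpr std::size_t kPtrSize = 4;

// Number of pointer-sized slots that fit in a byte count.
constexpr std::size_t slots_in(std::size_t bytes) { return bytes / kPtrSize; }

// Values read from the dump: UserSize from !heap -p -a, length from the memset frame.
constexpr std::size_t kAllocatedBytes = 0x20;
constexpr std::size_t kMemsetBytes    = 0x24;

static_assert(slots_in(kAllocatedBytes) == 8, "the block holds 8 slots");
static_assert(slots_in(kMemsetBytes) == 9, "memset is asked to clear 9 slots");
static_assert(kMemsetBytes - kAllocatedBytes == kPtrSize, "the overrun is exactly one slot");

// Hypothetical stand-in for a per-thread slot table (not MFC's type).
struct SlotArray {
    std::vector<void*> slots;  // plays the role of pData->pData
    std::size_t count = 0;     // plays the role of pData->nCount
};

// Grow the table from ONE snapshot of the shared maximum, so the allocation
// size and the zero-fill size cannot disagree even if the shared value
// changes afterwards.
inline void grow_snapshot(SlotArray& a, const std::size_t& shared_max) {
    const std::size_t max_now = shared_max;  // read exactly once
    assert(max_now >= a.count);
    a.slots.resize(max_now, nullptr);        // allocate and zero the new tail together
    a.count = max_now;
}
```

In a real multithreaded program the snapshot would also have to be taken under whatever lock protects the shared maximum; the sketch only shows the sizing logic.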
From: Joseph M. Newcomer on 3 May 2010 20:32

It looks to me like you are seeing a problem in synchronization. You have not shown that any heap function is on the stack at this point.

Note that heap corruption can occur at any time; when you get a message about heap corruption, it is not saying "I have caused heap corruption"; it is saying "I finally discovered that some time in the unknown past, somebody has corrupted the heap". The damage could be trillions of instructions in the past.

While this is a good bug report, unfortunately it isn't much help in tracking down what the corruption is or who did it. These still rank as the nastiest possible bugs to find and fix.

When you say "full page heap enabled", what are you doing to enable it?

joe

Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
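The point about late discovery can be illustrated with a toy model. This is only a sketch and not how the Windows heap is implemented: `ToyBlock` is a hypothetical type that keeps its payload and a 4-byte canary in one array, so the simulated overrun stays inside memory the object owns. The write that does the damage returns normally; the problem surfaces only when something later checks the canary.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Toy block: 8 payload bytes followed by a 4-byte canary, all in one array so
// the "overrun" below stays inside storage we own (no undefined behaviour).
struct ToyBlock {
    static constexpr std::size_t kPayload = 8;
    static constexpr unsigned char kCanary = 0xAB;
    std::array<unsigned char, kPayload + 4> bytes{};

    ToyBlock() { std::memset(bytes.data() + kPayload, kCanary, 4); }

    // Writes len bytes from the start of the payload without checking the
    // payload size, the way a plain memset would.
    void fill(unsigned char value, std::size_t len) {
        assert(len <= bytes.size());
        std::memset(bytes.data(), value, len);
    }

    // A check that a debugging allocator might run later, e.g. at free time.
    bool canary_intact() const {
        for (std::size_t i = kPayload; i < bytes.size(); ++i)
            if (bytes[i] != kCanary) return false;
        return true;
    }
};
```

Calling `fill(0, 9)` on such a block succeeds silently; only a later `canary_intact()` call reports the damage, by which time the code that caused it is no longer on the stack.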
From: Alexandre Grigoriev on 3 May 2010 23:25

Was MyDll loaded before the failing thread started, or after?

Set a breakpoint in MyDll!_DllMainCRTStartup in the DLL_THREAD_DETACH handler and see what's going on there.
From: Olegas on 5 May 2010 15:01

Joseph,

Thank you for the follow-up. We enable full page heap for our application via the GFlags utility as follows:

GFLAGS.exe /p /enable MyApp.exe /full

Once we enabled full page heap, our application started crashing once in a while at our customer's location. Unfortunately, it is not reproducible on demand either at the customer's location or in the office under a debugger, which leads me to believe it has to be some type of timing-related issue.

In this particular dump, page heap's guard pages essentially prevented heap corruption from taking place during memset() execution. If it wasn't for the guard pages, memset() would happily overwrite 4 bytes past the end of the allocated block and the application would crash elsewhere down the road.

Thank you,
Olegas
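The guard-page behaviour described in this reply can be verified from the numbers in the original dump. The sketch below is plain arithmetic on values copied from the `!heap -p -a` output and the memset frame; it assumes the usual 4 KB x86 page size, and the helper names are hypothetical. It shows that the 0x20-byte block ends exactly on a page boundary, so the first byte past it already lies in the inaccessible last page of the 0x2000-byte virtual range.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the !heap -p -a output and the memset frame in the dump.
constexpr std::uint32_t kUserAddr  = 0x02c72fe0;  // start of the user block
constexpr std::uint32_t kUserSize  = 0x20;        // bytes actually allocated
constexpr std::uint32_t kVirtAddr  = 0x02c72000;  // start of the virtual range
constexpr std::uint32_t kVirtSize  = 0x2000;      // two pages
constexpr std::uint32_t kPageSize  = 0x1000;      // assumption: 4 KB x86 pages
constexpr std::uint32_t kFillBytes = 0x24;        // length passed to memset

// First address past the user block.
constexpr std::uint32_t block_end() { return kUserAddr + kUserSize; }

// True if addr lies in the last page of the virtual range, which full page
// heap leaves inaccessible so that overruns fault immediately.
constexpr bool in_guard_page(std::uint32_t addr) {
    return addr >= kVirtAddr + kVirtSize - kPageSize &&
           addr <  kVirtAddr + kVirtSize;
}

// Bytes that a fill of len bytes from the block start would write past the block.
constexpr std::uint32_t overrun_bytes(std::uint32_t len) {
    return len > kUserSize ? len - kUserSize : 0;
}

static_assert(block_end() == 0x02c73000, "block ends exactly on a page boundary");
static_assert(block_end() % kPageSize == 0, "end address is page aligned");
static_assert(in_guard_page(block_end()), "first byte past the block is guarded");
static_assert(overrun_bytes(kFillBytes) == 4, "memset would overrun by 4 bytes");
```

Without page heap, the same 4 stray bytes would land in whatever follows the block on the normal heap, which is consistent with the crash "elsewhere down the road" that the reply anticipates.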
From: Olegas on 5 May 2010 15:08

Alexandre,

Thank you for your reply. That is a good question. Unfortunately, all I have is a second-chance dump collected via NTSD. I do not have a matching AdPlus log that would show the sequence of module load / unload events. Is there a way to learn that info from a crash dump?

I'll follow your suggestion under a debugger and see if anything stands out.

Thank you,
Olegas