Delphi DLL Stability Fixes

Writing a DLL in Delphi (or BCB, for that matter) that will be injected into another process's address space has long been plagued by rumors of instability and doom. Most developers have had to fall back on Microsoft compilers to produce stable DLLs for this purpose. Are these accusations true? In short... yes. If the DLL was injected into a process that makes heavy use of threads (Explorer is a classic example), then a Delphi DLL was not a viable choice for a production application. There were many rumors floating around the web, and most Delphi experts were of the opinion that the problem lay in the way Delphi managed Thread Local Storage (TLS) in the DLL during initialization and shutdown.

  I spent countless hours single-stepping through a context menu extension for Explorer, trying to get it to consistently reproduce an AV (access violation) that would occasionally occur. Reports from the newsgroups seemed to indicate that the problem was more severe in networked environments. I concluded the same, as the problem was more prevalent when I was connected to the Internet or logged on to a network.

  I struggled with this for over 3 years. Finally I wanted to bring together my EasyNSE package but was reluctant to do so because of the instability problems. I had found one "patch" that seemed to make the DLLs stable, but that was not enough for me. I needed to understand the cause and not rely on a magic API call that "seemed" to fix the problem.

library MysteryHookFixSample;

uses
  Windows; // declares DisableThreadLibraryCalls

begin
  DisableThreadLibraryCalls(HInstance);
end.

  The above code inserted into the library would seem to fix the problem, but why?

  A few months ago I again fired up Win98 (the easiest platform for me to reproduce the problem on) and went after the root cause again. Apparently the alignment of the stars was such that I was able to reproduce the AV consistently! At the time I was talking to Mathias of madExcept fame. Mathias quickly tracked the problem down and found a simple solution that could be implemented outside of the Runtime Library (RTL). Here is his analysis:

  "I have found the bug in the RTL and I have a clean workaround. Let me
  explain the situation:

  DLL_PROCESS_ATTACH -> SysInit.InitProcessTLS -> SysInit.InitThreadTLS
  DLL_THREAD_ATTACH -> SysInit.InitThreadTLS
  DLL_THREAD_DETACH -> SysInit.ExitThreadTLS
  DLL_PROCESS_DETACH -> SysInit.ExitProcessTLS -> SysInit.ExitThreadTLS

  As you can see, DLL_XXX_ATTACH always ends up in "InitThreadTLS", while
  DLL_XXX_DETACH always ends up in "ExitThreadTLS". We can forget about
  "Init/ExitProcessTLS". Now let's go through some situations. For all of
  the following cases please note that the events are meant to be for the
  very same thread (that's important!!).

  (1)
  InitThreadTLS
  ExitThreadTLS
  -> perfectly fine (standard case)

  (2)
  ExitThreadTLS
  -> strange situation, but no problems

  (3)
  InitThreadTLS
  -> memory leak (but I think this situation will never occur)

  (4)
  InitThreadTLS
  InitThreadTLS
  ExitThreadTLS
  -> memory leak (but I think this situation will never occur)

  (5)
  InitThreadTLS
  ExitThreadTLS
  ExitThreadTLS
  -> LocalFree gets called twice for the same pointer

  Now what happens when doing CBT hooking in win98 is situation (5). The
  very same thread gets a DLL_PROCESS_ATTACH + DLL_THREAD_DETACH +
  DLL_PROCESS_DETACH event. And the end result is the Explorer crash. If
  you ask me, the Borland programmers didn't believe, that case (5) can
  happen - but it does. Now let's look at my patched "ExitThreadTLS"
  function. I just added one line: "
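
  The five cases in the analysis above can be modeled without touching the real RTL. The following is only a sketch under stated assumptions: Slot stands in for the per-thread TLS value, Frees counts how often the same block would be handed to LocalFree, and every Sim... routine is a hypothetical stand-in, not the SysInit code itself.

```pascal
program TlsCaseSketch;
{$APPTYPE CONSOLE}

var
  Block: Integer;  // stands in for the block the RTL allocates per thread
  Slot: Pointer;   // stands in for the value stored in the TLS slot
  Frees: Integer;  // how many times the block would have been freed

procedure SimInitThreadTLS;
begin
  Slot := @Block;
  Frees := 0;
end;

// Models the behaviour described in the analysis: the block is freed,
// but the slot keeps pointing at it.
procedure SimExitThreadTLS;
begin
  if Slot <> nil then
    Inc(Frees);
end;

// Models the patched behaviour: free the block, then clear the slot.
procedure SimPatchedExitThreadTLS;
begin
  if Slot <> nil then
  begin
    Inc(Frees);
    Slot := nil;
  end;
end;

begin
  // Case (5): one init followed by two exits on the same thread.
  SimInitThreadTLS;
  SimExitThreadTLS;
  SimExitThreadTLS;
  Writeln('unpatched frees: ', Frees); // 2, i.e. a double free

  SimInitThreadTLS;
  SimPatchedExitThreadTLS;
  SimPatchedExitThreadTLS;
  Writeln('patched frees: ', Frees);   // 1, the second exit is skipped
end.
```

  The only difference between the two exit routines is the line that clears the slot, which is why the second exit becomes harmless.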

  So how do you fix it? When a process loads a DLL, the RTL first gets a crack at the entry point of the DLL. It sets up some variables and allocates a Thread Local Storage slot for each thread or process that attaches to the DLL. This is an important point, as only 64 TLS slots are guaranteed to be available, so the call to DisableThreadLibraryCalls is still a good idea if you don't need to know about threads being attached to the DLL. Why is this done in the RTL? I think it is to support threadvar declarations in Delphi, which give each thread its own unique copy of the data.
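
  To illustrate what a threadvar provides, here is a minimal console sketch. It is not taken from the RTL or from EasyNSE; TWorker, Counter and Seen are hypothetical names chosen for the example.

```pascal
program ThreadVarSketch;
{$APPTYPE CONSOLE}

uses
  Classes; // TThread

threadvar
  Counter: Integer; // every thread gets its own zero-initialized copy

type
  // Hypothetical worker thread used only for this illustration
  TWorker = class(TThread)
  protected
    procedure Execute; override;
  public
    Seen: Integer; // what the worker found in its own copy of Counter
  end;

procedure TWorker.Execute;
begin
  Counter := Counter + 1; // touches only the worker's copy
  Seen := Counter;
end;

var
  W: TWorker;
begin
  Counter := 100;              // the main thread's copy
  W := TWorker.Create(False);  // start the worker immediately
  W.WaitFor;
  Writeln('worker saw ', W.Seen);       // 1
  Writeln('main still has ', Counter);  // 100
  W.Free;
end.
```

  The worker starts from zero even though the main thread stored 100, because each thread reads and writes its own block of thread-local memory.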

  During this initialization Delphi allows the DLL to set up a procedure (assigned to DLLProc) that will be called every time a process or thread attaches to or detaches from the DLL.

library HookFixSample;

uses
  Windows; // declares the DLL_XXX reason codes

procedure DLLEntryProc(EntryCode: Integer);
begin
  case EntryCode of
    DLL_PROCESS_DETACH:
      begin
      end;
    DLL_PROCESS_ATTACH:
      begin
      end;
    DLL_THREAD_ATTACH:
      begin
      end;
    DLL_THREAD_DETACH:
      begin
      end;
  end;
end;

begin
  DLLProc := @DLLEntryProc;
  // We are already inside the process attach by the time we get to this point,
  // so we call the function manually
  DLLEntryProc(DLL_PROCESS_ATTACH);
end.

  Now, based on Mathias's debugging, the problem occurs during thread detach, so we can fix the problem there.

library HookFixSampleD6andD7;

uses
  Windows; // declares TlsGetValue, TlsSetValue, LocalFree and the DLL_XXX codes

procedure madPatch_ExitThreadTLS;
var
  p: Pointer;
begin
  if @TlsLast = nil then
    Exit;
  if TlsIndex <> -1 then
  begin
    p := TlsGetValue(TlsIndex);
    if p <> nil then
    begin
      // The RTL checks the TLS value for nil, so if we free the block first and
      // then set the value to nil, the RTL will find nil when it tries to free
      // the block and skip it
      LocalFree(Cardinal(p));
      TlsSetValue(TlsIndex, nil); // <- this fixes case (5), the RTL does not nil the value
    end;
  end;
end;

procedure DLLEntryProc(EntryCode: Integer);
begin
  case EntryCode of
    DLL_PROCESS_DETACH:
      begin
      end;
    DLL_PROCESS_ATTACH:
      begin
      end;
    DLL_THREAD_ATTACH:
      begin
      end;
    DLL_THREAD_DETACH:
      begin
        madPatch_ExitThreadTLS;
      end;
  end;
end;

begin
  DLLProc := @DLLEntryProc;
  // We are already inside the process attach by the time we get to this point,
  // so we call the function manually
  DLLEntryProc(DLL_PROCESS_ATTACH);
end.

  This was all well and good until it was tried in D5. The RTL changed a bit between D5 and D6, likely due to Kylix, and for whatever reason that change breaks the above fix.

  At this point everything is great and we are ready to create stable and robust COM and Hook DLLs, right? Well, if you are using D6 or greater, yes; if you are using D5 or lower, no. There is another problem with the RTL implementation. In the older compilers the Floating Point Unit is not correctly set up when the DLL is initialized, so we need to do it for the DLL.

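library HookFixSampleD4toD7;

// The COMPILER_x_UP symbols used below are assumed to be defined by a
// compiler-version include file; they are not built into the compiler.

uses
  Windows; // declares TlsGetValue, TlsSetValue, LocalFree and the DLL_XXX codes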

procedure madPatch_ExitThreadTLS;
var
  p: Pointer;
begin
  if @TlsLast = nil then
    Exit;
  if TlsIndex <> -1 then
  begin
    p := TlsGetValue(TlsIndex);
    if p <> nil then
    begin
      // The RTL checks the TLS value for nil, so once the value is nil the RTL
      // will skip its own attempt to free the block
      {$IFDEF COMPILER_6_UP}
      // D6 and higher have not freed the block yet, so free it here
      LocalFree(Cardinal(p));
      {$ENDIF COMPILER_6_UP}
      // D5 and lower have already freed the block before calling this function.
      // In these compilers we can't free the memory but we can nil the value.
      TlsSetValue(TlsIndex, nil); // <- this fixes case (5), the RTL does not nil the value
    end;
  end;
end;

// D5 fixes this problem
{$IFNDEF COMPILER_5_UP}
var
  ControlWord: Word; // FPU control word found at process attach
{$ENDIF}

procedure DLLEntryProc(EntryCode: Integer);
begin
  case EntryCode of
    DLL_PROCESS_DETACH:
      begin
        // D5 fixes this problem
        {$IFNDEF COMPILER_5_UP}
        Set8087CW(ControlWord); // restore the control word saved at attach
        {$ENDIF}
      end;
    DLL_PROCESS_ATTACH:
      begin
        // D5 fixes this problem
        {$IFNDEF COMPILER_5_UP}
        // Save the current control word so it can be restored at detach
        asm
          FNSTCW ControlWord
        end;
        Set8087CW($133F);
        {$ENDIF}
      end;
    DLL_THREAD_ATTACH:
      begin
      end;
    DLL_THREAD_DETACH:
      begin
        madPatch_ExitThreadTLS;
      end;
  end;
end;

begin
  DLLProc := @DLLEntryProc;
  // We are already inside the process attach by the time we get to this point,
  // so we call the function manually
  DLLEntryProc(DLL_PROCESS_ATTACH);
end.
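
  The value $133F passed to Set8087CW in the template above is easier to read once it is split into the fields of the x87 control word. The following console sketch only decodes the constant; it does not touch the FPU, and the program name is made up for the example.

```pascal
program ControlWordSketch;
{$APPTYPE CONSOLE}

const
  CW = $133F; // the value passed to Set8087CW in the template

begin
  // Bits 0..5 are the exception masks; a set bit means "masked".
  Writeln('exception masks:   ', CW and $3F);         // 63, all six masked
  // Bits 8..9 select the precision; 3 means 64-bit (extended) mantissa.
  Writeln('precision control: ', (CW shr 8) and 3);   // 3
  // Bits 10..11 select the rounding mode; 0 means round to nearest.
  Writeln('rounding control:  ', (CW shr 10) and 3);  // 0
end.
```

  In other words, $133F masks every floating point exception while keeping extended precision and round-to-nearest. Delphi's own default of $1332 leaves the invalid operation, divide by zero and overflow exceptions unmasked, which is presumably why the template sets the control word explicitly when the DLL lives inside someone else's process.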

  That's it. If this template is used to create your Hook or COM object DLL, it will produce a DLL that is just as stable as one developed with a Microsoft compiler. Hopefully Borland will have this fixed in the next version of Delphi.

 


mustangpeak.net
