Kernel modules

Kernel modules This section cover the kernel modules. As already stated, Wine implements the NT architecture, hence provides NTDLL for the core kernel functions, and KERNEL32, which is the implementation of the basis of the Win32 subsystem, on top of NTDLL. NTDLL NTDLL provides most of the services you'd expect from a kernel. Process and thread management are part of them (even if process management is still mainly done in KERNEL32, unlike NT). A Windows process runs as a Unix process, and a Windows thread runs as a Unix thread. Wine also provide fibers (which is the Windows name of co-routines). Most of the Windows memory handling (Heap, Global and Local functions, virtual memory...) are easily mapped upon their Unix equivalents. Note the NTDLL doesn't know about 16 bit memory, which is only handled in KERNEL32/KRNL386.EXE (and also the DOS routines). File management Wine uses some configuration in order to map Windows filenames (either defined with drive letters, or as UNC names) to the unix filenames. Wine also uses some incantation so that most of file related APIs can also take full unix names. This is handy when passing filenames on the command line. File handles can be waitable objects, as Windows define them. Asynchronous I/O is implemented on file handles by queueing pseudo APC. They are not real APC in the sense that they have the same priority as the threads in the considered process (while APCs on NT have normally a higher priority). These APCs get called when invoking Wine server (which should lead to correct behavior when the program ends up waiting on some object - waiting always implies calling Wine server). FIXME: this should be enhanced and updated to latest work on FS. Synchronization Most of the synchronization (between threads or processes) is done in Wine server, which handles both the waiting operation (on a single object or a set of objects) and the signaling of objects. Module (DLL) loading Wine is able to load any NE and PE module. In all cases, the module's binary code is directly executed by the processor. Device management Wine allows usage a wide variety of devices: Communication ports are mapped to Unix communication ports (if they have sufficient permissions). Parallel ports are mapped to Unix parallel ports (if they have sufficient permissions). CDROM: the Windows device I/O control calls are mapped onto Unix ioctl(). Some Win9x VxDs are supported, by rewriting some of their internal behavior. But this support is limited. Portable programs to Windows NT shouldn't need them. Wine will not support native VxD. Multi-threading in Wine This section will assume you understand the basics of multithreading. If not there are plenty of good tutorials available on the net to get you started. Threading in Wine is somewhat complex due to several factors. The first is the advanced level of multithreading support provided by Windows - there are far more threading related constructs available in Win32 than the Linux equivalent (pthreads). The second is the need to be able to map Win32 threads to native Linux threads which provides us with benefits like having the kernel schedule them without our intervention. While it's possible to implement threading entirely without kernel support, doing so is not desirable on most platforms that Wine runs on. Threading support in Win32 Win32 is an unusually thread friendly API. Not only is it entirely thread safe, but it provides many different facilities for working with threads. These range from the basics such as starting and stopping threads, to the extremely complex such as injecting threads into other processes and COM inter-thread marshalling. One of the primary challenges of writing Wine code therefore is ensuring that all our DLLs are thread safe, free of race conditions and so on. This isn't simple - don't be afraid to ask if you aren't sure whether a piece of code is thread safe or not! Win32 provides many different ways you can make your code thread safe however the most common are critical section and the interlocked functions. Critical sections are a type of mutex designed to protect a geographic area of code. If you don't want multiple threads running in a piece of code at once, you can protect them with calls to EnterCriticalSection and LeaveCriticalSection. The first call to EnterCriticalSection by a thread will lock the section and continue without stopping. If another thread calls it then it will block until the original thread calls LeaveCriticalSection again. It is therefore vitally important that if you use critical sections to make some code thread-safe, that you check every possible codepath out of the code to ensure that any held sections are left. Code like this: if (res != ERROR_SUCCESS) return res; is extremely suspect in a function that also contains a call to EnterCriticalSection. Be careful. If a thread blocks while waiting for another thread to leave a critical section, you will see an error from the RtlpWaitForCriticalSection function, along with a note of which thread is holding the lock. This only appears after a certain timeout, normally a few seconds. It's possible the thread holding the lock is just being really slow which is why Wine won't terminate the app like a non-checked build of Windows would, but the most common cause is that for some reason a thread forgot to call LeaveCriticalSection, or died while holding the lock (perhaps because it was in turn waiting for another lock). This doesn't just happen in Wine code: a deadlock while waiting for a critical section could be due to a bug in the app triggered by a slight difference in the emulation. Another popular mechanism available is the use of functions like InterlockedIncrement and InterlockedExchange. These make use of native CPU abilities to execute a single instruction while ensuring any other processors on the system cannot access memory, and allow you to do common operations like add/remove/check a variable in thread-safe code without holding a mutex. These are useful for reference counting especially in free-threaded (thread safe) COM objects. Finally, the usage of TLS slots are also popular. TLS stands for thread-local storage, and is a set of slots scoped local to a thread which you can store pointers in. Look on MSDN for the TlsAlloc function to learn more about the Win32 implementation of this. Essentially, the contents of a given slot will be different in each thread, so you can use this to store data that is only meaningful in the context of a single thread. On recent versions of Linux the __thread keyword provides a convenient interface to this functionality - a more portable API is exposed in the pthread library. However, these facilities is not used by Wine, rather, we implement Win32 TLS entirely ourselves. SysLevels SysLevels are an undocumented Windows-internal thread-safety system. They are basically critical sections which must be taken in a particular order. The mechanism is generic but there are always three syslevels: level 1 is the Win16 mutex, level 2 is the USER mutex and level 3 is the GDI mutex. When entering a syslevel, the code (in dlls/kernel/syslevel.c) will check that a higher syslevel is not already held and produce an error if so. This is because it's not legal to enter level 2 while holding level 3 - first, you must leave level 3. Throughout the code you may see calls to _ConfirmSysLevel() and _CheckNotSysLevel(). These functions are essentially assertions about the syslevel states and can be used to check that the rules have not been accidentally violated. In particular, _CheckNotSysLevel() will break (probably into the debugger) if the check fails. If this happens the solution is to get a backtrace and find out, by reading the source of the wine functions called along the way, how Wine got into the invalid state. POSIX threading vs kernel threading Wine runs in one of two modes: either pthreads (posix threading) or kthreads (kernel threading). This section explains the differences between them. The one that is used is automatically selected on startup by a small test program which then execs the correct binary, either wine-kthread or wine-pthread. On NPTL-enabled systems pthreads will be used, and on older non-NPTL systems kthreads is selected. Let's start with a bit of history. Back in the dark ages when Wines threading support was first implemented a problem was faced - Windows had much more capable threading APIs than Linux did. This presented a problem - Wine works either by reimplementing an API entirely or by mapping it onto the underlying systems equivalent. How could Win32 threading be implemented using a library which did not have all the neeed features? The answer, of course, was that it couldn't be. On Linux the pthreads interface is used to start, stop and control threads. The pthreads library in turn is based on top of so-called "kernel threads" which are created using the clone(2) syscall. Pthreads provides a nicer (more portable) interface to this functionality and also provides APIs for controlling mutexes. There is a good tutorial on pthreads available if you want to learn more. As pthreads did not provide the necessary semantics to implement Win32 threading, the decision was made to implement Win32 threading on top of the underlying kernel threads by using syscalls like clone directly. This provided maximum flexibility and allowed a correct implementation but caused some bad side effects. Most notably, all the userland Linux APIs assumed that the user was utilising the pthreads library. Some only enabled thread safety when they detected that pthreads was in use - this is true of glibc, for instance. Worse, pthreads and pure kernel threads had strange interactions when run in the same process yet some libraries used by Wine used pthreads internally. Throw in source code porting using WineLib - where you have both UNIX and Win32 code in the same process - and chaos was the result. The solution was simple yet ingenius: Wine would provide its own implementation of the pthread library inside its own binary. Due to the semantics of ELF symbol scoping, this would cause Wines own implementations to override any implementation loaded later on (like the real libpthread.so). Therefore, any calls to the pthread APIs in external libraries would be linked to Wines instead of the systems pthreads library, and Wine implemented pthreads by using the standard Windows threading APIs it in turn implemented itself. As a result, libraries that only became thread-safe in the presence of a loaded pthreads implementation would now do so, and any external code that used pthreads would actually end up creating Win32 threads that Wine was aware of and controlled. This worked quite nicely for a long time, even though it required doing some extremely un-kosher things like overriding internal libc structures and functions. That is, it worked until NPTL was developed at which point the underlying thread implementation on Linux changed dramatically. The fake pthread implementation can be found in loader/kthread.c, which is used to produce to wine-kthread binary. In contrast, loader/pthread.c produces the wine-pthread binary which is used on newer NPTL systems. NPTL is a new threading subsystem for Linux that hugely improves its performance and flexibility. By allowing threads to become much more scalable and adding new pthread APIs, NPTL made Linux competitive with Windows in the multi-threaded world. Unfortunately it also broke many assumptions made by Wine (as well as other applications such as the Sun JVM and RealPlayer) in the process. There was, however, some good news. NPTL made Linux threading powerful enough that Win32 threads could now be implemented on top of pthreads like any other normal application. There would no longer be problems with mixing win32-kthreads and pthreads created by external libraries, and no need to override glibc internals. As you can see from the relative sizes of the loader/kthread.c and loader/pthread.c files, the difference in code complexity is considerable. NPTL also made several other semantic changes to things such as signal delivery so changes were required in many different places in Wine. On non-Linux systems the threading interface is typically not powerful enough to replicate the semantics Win32 applications expect and so kthreads with the pthread overrides are used. The Win32 thread environment All Win32 code, whether from a native EXE/DLL or in Wine itself, expects certain constructs to be present in its environment. This section explores what those constructs are and how Wine sets them up. The lack of this environment is one thing that makes it hard to use Wine code directly from standard Linux applications - in order to interact with Win32 code a thread must first be "adopted" by Wine. The first thing Win32 code requires is the TEB or "Thread Environment Block". This is an internal (undocumented) Windows structure associated with every thread which stores a variety of things such as TLS slots, a pointer to the threads message queue, the last error code and so on. You can see the definition of the TEB in include/thread.h, or at least what we know of it so far. Being internal and subject to change, the layout of the TEB has had to be reverse engineered from scratch. A pointer to the TEB is stored in the %fs register and can be accessed using NtCurrentTeb() from within Wine code. %fs actually stores a selector, and setting it therefore requires modifying the processes local descriptor table (LDT) - the code to do this is in lib/wine/ldt.c. The TEB is required by nearly all Win32 code run in the Wine environment, as any wineserver RPC will use it, which in turn implies that any code which could possibly block (for instance by using a critical section) needs it. The TEB also holds the SEH exception handler chain as the first element, so if when disassembling you see code like this: movl %esp, %fs:0 ... then you are seeing the program set up an SEH handler frame. All threads must have at least one SEH entry, which normally points to the backstop handler which is ultimately responsible for popping up the all-too-familiar "This program has performed an illegal operation and will be terminated" message. On Wine we just drop straight into the debugger. A full description of SEH is out of the scope of this section, however there are some good articles in MSJ if you are interested. All Win32-aware threads must have a wineserver connection. Many different APIs require the ability to communicate with the wineserver. In turn, the wineserver must be aware of Win32 threads in order to be able to accurately report information to other parts of the program and do things like route inter-thread messages, dispatch APCs (asynchronous procedure calls) and so on. Therefore a part of thread initialization is initializing the thread serverside. The result is not only correct information in the server, but a set of file descriptors the thread can use to communicate with the server - the request fd, reply fd and wait fd (used for blocking). KERNEL Module FIXME: Needs some content... Consoles in Wine As described in the Wine User Guide's CUI section, Wine manipulates three kinds of "consoles" in order to support properly the Win32 CUI API. The following table describes the main implementation differences between the three approaches. Function consoles implementation comparison Function Bare streams Wineconsole & user backend Wineconsole & curses backend Console as a Win32 Object (and associated handles) No specific Win32 object is used in this case. The handles manipulated for the standard Win32 streams are in fact "bare handles" to their corresponding Unix streams. The mode manipulation functions (GetConsoleMode / SetConsoleMode) are not supported. Implemented in server, and a specific Winelib program (wineconsole) is in charge of the rendering and user input. The mode manipulation functions behave as expected. Implemented in server, and a specific Winelib program (wineconsole) is in charge of the rendering and user input. The mode manipulation functions behave as expected. Inheritance (including handling in CreateProcess of CREATE_DETACHED, CREATE_NEW_CONSOLE flags). Not supported. Every process child of a process will inherit the Unix streams, so will also inherit the Win32 standard streams. Fully supported (each new console creation will be handled by the creation of a new USER32 window) Fully supported, except for the creation of a new console, which will be rendered on the same Unix terminal as the previous one, leading to unpredictable results. ReadFile / WriteFile operations Fully supported Fully supported Fully supported Screen-buffer manipulation (creation, deletion, resizing...) Not supported Fully supported Partly supported (this won't work too well as we don't control (so far) the size of underlying Unix terminal APIs for reading/writing screen-buffer content, cursor position Not supported Fully supported Fully supported APIs for manipulating the rendering window size Not supported Fully supported Partly supported (this won't work too well as we don't control (so far) the size of underlying Unix terminal Signaling (in particular, Ctrl-C handling) Nothing is done, which means that Ctrl-C will generate (as usual) a SIGINT which will terminate the program. Partly supported (Ctrl-C behaves as expected, however the other Win32 CUI signaling isn't properly implemented). Partly supported (Ctrl-C behaves as expected, however the other Win32 CUI signaling isn't properly implemented).

The Win32 objects behind a console can be created in several occasions: When the program is started from wineconsole, a new console object is created and will be used (inherited) by the process launched from wineconsole. When a program, which isn't attached to a console, calls AllocConsole, Wine then launches wineconsole, and attaches the current program to this console. In this mode, the USER32 mode is always selected as Wine cannot tell the current state of the Unix console. Please also note, that starting a child process with the CREATE_NEW_CONSOLE flag, will end-up calling AllocConsole in the child process, hence creating a wineconsole with the USER32 backend. The Wine initialization process Wine has a rather complex startup procedure, so unlike many programs the best place to begin exploring the code-base is not in fact at the main() function but instead at some of the more straightforward DLLs that exist on the periphery such as MSI, the widget library (in USER and COMCTL32) etc. The purpose of this section is to document and explain how Wine starts up from the moment the user runs "wine myprogram.exe" to the point at which myprogram gets control. First Steps The actual wine binary that the user runs does not do very much, in fact it is only responsible for checking the threading model in use (NPTL vs LinuxThreads) and then invoking a new binary which performs the next stage in the startup sequence. See the threading chapter for more information on this check and why it's necessary. You can find this code in loader/glibc.c. The result of this check is an exec of either wine-pthread or wine-kthread, potentially (on Linux) via the preloader. We need to use separate binaries here because overriding the native pthreads library requires us to exploit a property of ELF symbol fixup semantics: it's not possible to do this without starting a new process. The Wine preloader is found in loader/preloader.c, and is required in order to impose a Win32 style address space layout upon the newly created Win32 process. The details of what this does is covered in the address space layout chapter. The preloader is a statically linked ELF binary which is passed the name of the actual Wine binary to run (either wine-kthread or wine-pthread) along with the arguments the user passed in from the command line. The preloader is an unusual program: it does not have a main() function. In standard ELF applications, the entry point is actually at a symbol named _start: this is provided by the standard gcc infrastructure and normally jumps to __libc_start_main which initializes glibc before passing control to the main function as defined by the programmer. The preloader takes control direct from the entry point for a few reasons. Firstly, it is required that glibc is not initialized twice: the result of such behaviour is undefined and subject to change without notice. Secondly, it's possible that as part of initializing glibc, the address space layout could be changed - for instance, any call to malloc will initialize a heap arena which modifies the VM mappings. Finally, glibc does not return to _start at any point, so by reusing it we avoid the need to recreate the ELF bootstrap stack (env, argv, auxiliary array etc). The preloader is responsible for two things: protecting important regions of the address space so the dynamic linker does not map shared libraries into them, and once that is done loading the real Wine binary off disk, linking it and starting it up. Normally all this is done automatically by glibc and the kernel but as we intercepted this process by using a static binary it's up to us to restart the process. The bulk of the code in the preloader is about loading wine-[pk]thread and ld-linux.so.2 off disk, linking them together, then starting the dynamic linking process. One of the last things the preloader does before jumping into the dynamic linker is scan the symbol table of the loaded Wine binary and set the value of a global variable directly: this is a more efficient way of passing information to the main Wine program than flattening the data structures into an environment variable or command line parameter then unpacking it on the other side, but it achieves pretty much the same thing. The global variable set points to the preload descriptor table, which contains the VMA regions protected by the preloader. This allows Wine to unmap them once the dynamic linker has been run, so leaving gaps we can initialize properly later on. Starting the emulator The process of starting up the emulator itself is mostly one of chaining through various initializer functions defined in the core libraries and DLLs: libwine, then NTDLL, then kernel32. Both the wine-pthread and wine-kthread binaries share a common main function, defined in loader/main.c, so no matter which binary is selected after the preloader has run we start here. This passes the information provided by the preloader into libwine and then calls wine_init, defined in libs/wine/loader.c. This is where the emulation really starts: wine_init can, with the correct preparation, be called from programs other than the wine loader itself. wine_init does some very basic setup tasks such as initializing the debugging infrastructure, yet more address space manipulation (see the information on the 4G/4G VM split in the address space chapter), before loading NTDLL - the core of both Wine and the Windows NT series - and jumping to the __wine_process_init function defined in dlls/ntdll/loader.c This function is responsible for initializing the primary Win32 environment. In thread_init(), it sets up the TEB, the wineserver connection for the main thread and the process heap. See the threading chapter for more information on this. Finally, it loads and jumps to __wine_kernel_init in kernel32.dll: this is defined in dlls/kernel32/process.c. This is where the bulk of the work is done. The kernel32 initialization code retrieves the startup info for the process from the server, initializes the registry, sets up the drive mapping system and locale data, then begins loading the requested application itself. Each process has a STARTUPINFO block that can be passed into CreateProcess specifying various things like how the first window should be displayed: this is sent to the new process via the wineserver. After determining the type of file given to Wine by the user (a Win32 EXE file, a Win16 EXE, a Winelib app etc), the program is loaded into memory (which may involve loading and initializing other DLLs, the bulk of Wines startup code), before control reaches the end of __wine_kernel_init. This function ends with the new process stack being initialized, and start_process being called on the new stack. Nearly there! The final element of initializing Wine is starting the newly loaded program itself. start_process sets up the SEH backstop handler, calls LdrInitializeThunk which performs the last part of the process initialization (such as performing relocations and calling the DllMains with PROCESS_ATTACH), grabs the entry point of the executable and then on this line: ExitProcess( entry( peb ) ); ... jumps to the entry point of the program. At this point the users program is running and the API provided by Wine is ready to be used. When entry returns, the ExitProcess API will be used to initialize a graceful shutdown. Structured Exception Handling Structured Exception Handling (or SEH) is an implementation of exceptions inside the Windows core. It allows code written in different languages to throw exceptions across DLL boundaries, and Windows reports various errors like access violations by throwing them. This section looks at how it works, and how it's implemented in Wine. How SEH works SEH is based on embedding EXCEPTION_REGISTRATION_RECORD structures in the stack. Together they form a linked list rooted at offset zero in the TEB (see the threading section if you don't know what this is). A registration record points to a handler function, and when an exception is thrown the handlers are executed in turn. Each handler returns a code, and they can elect to either continue through the handler chain or it can handle the exception and then restart the program. This is referred to as unwinding the stack. After each handler is called it's popped off the chain. Before the system begins unwinding the stack, it runs vectored handlers. This is an extension to SEH available in Windows XP, and allows registered functions to get a first chance to watch or deal with any exceptions thrown in the entire program, from any thread. A thrown exception is represented by an EXCEPTION_RECORD structure. It consists of a code, flags, an address and an arbitrary number of DWORD parameters. Language runtimes can use these parameters to associate language-specific information with the exception. Exceptions can be triggered by many things. They can be thrown explicitly by using the RaiseException API, or they can be triggered by a crash (ie, translated from a signal). They may be used internally by a language runtime to implement language-specific exceptions. They can also be thrown across DCOM connections. Visual C++ has various extensions to SEH which it uses to implement, eg, object destruction on stack unwind as well as the ability to throw arbitrary types. The code for this is in dlls/msvcrt/except.c Translating signals to exceptions In Windows, compilers are expected to use the system exception interface, and the kernel itself uses the same interface to dynamically insert exceptions into a running program. By contrast on Linux the exception ABI is implemented at the compiler level (inside GCC and the linker) and the kernel tells a thread of exceptional events by sending signals. The language runtime may or may not translate these signals into native exceptions, but whatever happens the kernel does not care. You may think that if an app crashes, it's game over and it really shouldn't matter how Wine handles this. It's what you might intuitively guess, but you'd be wrong. In fact some Windows programs expect to be able to crash themselves and recover later without the user noticing, some contain buggy binary-only components from third parties and use SEH to swallow crashes, and still others execute priviledged (kernel-level) instructions and expect it to work. In fact, at least one set of APIs (the IsBad*Ptr series) can only be implemented by performing an operation that may crash and returning TRUE if it does, and FALSE if it doesn't! So, Wine needs to not only implement the SEH infrastructure but also translate Unix signals into SEH exceptions. The code to translate signals into exceptions is a part of NTDLL, and can be found in dlls/ntdll/signal_i386.c. This file sets up handlers for various signals during Wine startup, and for the ones that indicate exceptional conditions translates them into exceptions. Some signals are used by Wine internally and have nothing to do with SEH. Signal handlers in Wine run on their own stack. Each thread has its own signal stack which resides 4k after the TEB. This is important for a couple of reasons. Firstly, because there's no guarantee that the app thread which triggered the signal has enough stack space for the Wine signal handling code. In Windows, if a thread hits the limits of its stack it triggers a fault on the stack guard page. The language runtime can use this to grow the stack if it wants to. However, because a guard page violation is just a regular segfault to the kernel, that would lead to a nested signal handler and that gets messy really quick so we disallow that in Wine. Secondly, setting up the exception to throw requires modifying the stack of the thread which triggered it, which is quite hard to do when you're still running on it. Windows exceptions typically contain more information than the Unix standard APIs provide. For instance, a STATUS_ACCESS_VIOLATION exception (0xC0000005) structure contains the faulting address, whereas a standard Unix SIGSEGV just tells the app that it crashed. Usually this information is passed as an extra parameter to the signal handler, however its location and contents vary between kernels (BSD, Solaris, etc). This data is provided in a SIGCONTEXT structure, and on entry to the signal handler it contains the register state of the CPU before the signal was sent. Modifying it will cause the kernel to adjust the context before restarting the thread.