Build and manage large-scale C++ on Windows
DLLs versus shared objects
John Lakos wrote Large-Scale C++ Software Design more than 10 years ago, but it remains a must-read for any serious C++ developer today.
It doesn't delve much into the language itself. For instance, there isn't anything inside regarding dynamic casts and virtual inheritance, nor will it tell you how to calculate a factorial at compile time using recursive templates.
What it does do is address the very real issues faced when a project gets big and complicated, a reality in today's world of cloud computing, online services, distributed applications and data centers. These issues lie between the abstraction presented by the language and the underlying hardware, emerge as complexities of scale, and lead us to start subdividing the software into static libraries and shared objects.
This book's standing in the industry shows that the issues are taken seriously, are largely platform independent and should be understood. But what about the Windows-specific issues: DLLs versus shared objects?
DLLs on Windows are not 100 per cent analogous to shared objects in the Unix world. In a shared object all symbols are exported unless steps are taken to prevent this and, while a shared object knows its dependencies in terms of external symbols and libraries, the linker doesn't give an error if a needed externally defined symbol is not present at link time.
Dynamic loaders and real code
If the same symbol occurs in different libraries, as can happen with templates, then warnings result at link time and the duplicate symbols are merged by the dynamic loader. Data is also managed differently. Symbols naming static data are exported like any other and multiple definitions are merged to unambiguously name a single piece of memory at runtime.
DLLs on the other hand have a different model. Symbols that are intended for external use must be marked so that the linker knows to put them in the export table. This is what the declaration specifier __declspec(dllexport) is for and there are some divergences from Unix shared-object behavior as a result of this difference.
External code finds the symbol via a corresponding import declaration and by linking with the import library of the DLL. The import library is linked into a client's portable executable like any other static library, but when called its code will invoke the dynamic loader and call the real code via function pointers obtained from GetProcAddress. We'll see why this is important in a moment; for now just remember that code inside the DLL does not go through this import library.
Internal code accesses its siblings, both exported and otherwise, by offset addresses relative to the DLL base address. This implies the first important difference: linking a DLL on Windows is a full-blown link operation. All symbols are resolved and replaced with offset addresses, and errors are given if something is not found. In fact, when linking a DLL you are creating a real image in portable executable format, and this takes considerably more time than linking a static library, which is essentially just an archive.
Perhaps the book should be updated?
While this is an oldie-but-goodie, it might be good to update the book. C++ has morphed a bit, I'm sure the author has gained more experience, and things have changed just a bit.
While the book addresses the general practicalities of large-scale projects in C++, the author's article is about Windows DLLs and apparently porting from Unix. Most of the issues would seem to be addressed by some forward thought, also known as "design."
It would have been nice if the author had allowed more than one paragraph for the book, instead of focusing on Windows DLL files.
Why be huge?
Why make a single huge executable instead of using a zillion separate processes?
It's not like anybody's going to have a single CPU in their PC in even two years' time (not to mention the servers), so your choices are to build either insanely complex multithreaded beasts that will never have a fraction of their bugs dealt with, or a huge number of different processes that can each be tested to spec, if not expectation.
Also, avoid C++ entirely in favor of Java or some other CPU-wasting scripting abomination, because the user can always just add a few dozen more CPU cores to their system.
Sorry, this article is completely biased. There is no standard way in which UNIX handles shared libraries. The problems on Windows persist on UNIX, too.
It may be a correct comparison between Windows and Linux.
The author does not mention the namespace pollution caused by use of PAM modules on UNIX. For instance, you have to be very careful not to link against a private LDAP library, because an LDAP shared library gets injected if pam_ldap is configured! This is a major blocker for us when developing cross-platform software.
AIX shared libraries, for instance, behave very much like the Windows ones: symbols have to be exported explicitly.
ELF shared libraries (Linux, HP-UX ia64, Solaris) have a feature to avoid exporting every symbol, which prevents clashing of symbol names, but this has to be requested explicitly.
The author states that Windows DLLs use a full link step. There is a deprecated feature for marking DLL entries with ordinal numbers rather than symbol names, but this should not be used anymore.
IMHO, UNIX shared libraries are just as complex as the Windows ones. Have a look at HP-SUX PA-RISC 32-bit. Try to dlclose() a shared library: your application will surely crash, because they did not implement reference counting.