Reverse engineering Apple's OS X
A thunking good time
The Wrinkle
There's one wrinkle that you need to be aware of, especially when looking at framework code. To produce position-independent code, the GCC compiler uses a "thunking" technique that provides register-based access to global variables, selector strings, and so forth. Thunk calls always occur at the beginning of a method. A typical method prologue might look like this:
+(void)[SomeClass SomeMethod]
0000fa60  55          pushl  %ebp
0000fa61  89e5        movl   %esp,%ebp
0000fa63  57          pushl  %edi
0000fa64  56          pushl  %esi
0000fa65  53          pushl  %ebx
0000fa66  e800000000  calll  0x0000fa6b
0000fa6b  5b          popl   %ebx
If you look carefully at the above, you'll see that the function call at address 0000FA66 simply calls the very next address. In other words, the call pushes its return address, 0000FA6B, onto the stack, and execution then continues at that same address.
Popping this return address into EBX gives the method a known reference point, wherever the code happens to be loaded, from which global variables can be reached. For every EBX-relative data reference in this method (sometimes ECX or even EAX is used), we need to add the "thunk offset" (0000FA6B in this case) to the displacement in the instruction to determine the absolute address of the data that's being accessed.
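As a quick illustration of that arithmetic, here is a minimal Python sketch. The function name and the displacement values are hypothetical, chosen only for the example; the thunk offset is the one from the prologue above. It assumes 32-bit x86, so the sum wraps at 32 bits, which also copes with displacements that the disassembler prints as large unsigned values.

```python
MASK32 = 0xFFFFFFFF  # 32-bit x86 addresses wrap at 2**32


def absolute_address(thunk_offset, displacement):
    """Add a register-relative displacement to the thunk offset.

    thunk_offset: the address popped into the base register (0xFA6B above).
    displacement: the constant in an operand such as 0x1f95(%ebx); it may be
                  negative, or a large unsigned value standing for a negative one.
    """
    return (thunk_offset + displacement) & MASK32


# Hypothetical displacements, for illustration only.
print(hex(absolute_address(0xFA6B, 0x1F95)))      # 0x11a00
print(hex(absolute_address(0xFA6B, 0xFFFFFFF5)))  # 0xfa60 (displacement of -11)
```

The result is only meaningful relative to the addresses shown in the disassembly; at runtime the whole image may be loaded elsewhere, which is exactly why the compiler goes through the thunk.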
This makes it more tedious to figure out which global variables, string constants or selectors are being accessed. Fortunately, though, otx gets it right much of the time. For those times when otx gets it wrong, I've written a little utility for myself - a kind of otool "after-after-burner" - that processes the output of otx and massages every thunked address reference, so that the absolute address is always shown in the disassembly. It also recognizes and correctly handles old-style thunk calls such as "___i686.get_pc_thunk".
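To give a flavour of what such a post-processing pass involves, here is a rough Python sketch. It is not the utility described above, and it makes several assumptions: each instruction line follows the "address, hex bytes, mnemonic, operands" layout of the prologue listing, a thunk is a call to the next instruction (bytes e800000000) followed directly by a pop, method headers start with "+(" or "-(", and the base register is never overwritten later in the method. Real otx output may differ, and old-style "get_pc_thunk" calls are not handled.

```python
import re

# Assumed layout, based on the prologue listing:
# "<address> <hex bytes> <mnemonic> <operands>".
LINE_RE = re.compile(r"^\s*([0-9a-fA-F]{8})\s+([0-9a-fA-F]+)\s+(\S+)\s*(.*)$")
# AT&T-style register-relative operand, e.g. 0x00001f95(%ebx).
DISP_RE = re.compile(r"(-?0x[0-9a-fA-F]+)\(%(e[a-d]x)\)")


def annotate_thunks(lines):
    """Append the absolute address to every thunk-relative data reference."""
    bases = {}      # register name -> thunk offset captured by the pop
    pending = None  # return address pushed by a call to the next instruction
    out = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            # Assumption: a "+(" or "-(" line starts a new method.
            if line.lstrip().startswith(("+(", "-(")):
                bases = {}
            pending = None
            out.append(line)
            continue
        addr = int(m.group(1), 16)
        raw, mnemonic, operands = m.group(2).lower(), m.group(3), m.group(4)
        if mnemonic.startswith("call") and raw == "e800000000":
            pending = addr + 5  # the call instruction is five bytes long
        elif pending is not None and mnemonic.startswith("pop"):
            bases[operands.strip().lstrip("%")] = pending
            pending = None
        else:
            pending = None
            notes = []
            for disp, reg in DISP_RE.findall(operands):
                if reg in bases:
                    target = (bases[reg] + int(disp, 16)) & 0xFFFFFFFF
                    notes.append("0x%08x" % target)
            if notes:
                line = line + "  ; -> " + ", ".join(notes)
        out.append(line)
    return out


if __name__ == "__main__":
    listing = [
        "+(void)[SomeClass SomeMethod]",
        "0000fa66 e800000000 calll 0x0000fa6b",
        "0000fa6b 5b popl %ebx",
        # Hypothetical data reference, added for illustration.
        "0000fa6c 8b83951f0000 movl 0x00001f95(%ebx),%eax",
    ]
    print("\n".join(annotate_thunks(listing)))
```

Run on the sample listing, the last line gains the suffix "; -> 0x00011a00", while the header, the call and the pop pass through unchanged.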
Tick the three Objective-C check boxes when reverse engineering with otx
There's one last weapon in my reverse-engineering armory that definitely deserves a mention, and that's IDA Pro. Currently available here, IDA Pro is a sophisticated multi-processor disassembler that's equally at home munching away at x86 code (my preferred MO), PowerPC or even iPhone ARM executables. Using the output of IDA Pro, it's very easy - for example - to see how a specific function call is actually implemented inside some external dylib or framework.
With all these tools in your arsenal, reverse engineering Cocoa executables is actually very simple. In fact, it's a good deal more straightforward than most Windows executables, with the exception of Delphi and .NET where - like Cocoa - a good deal of runtime type information is contained within the executable. ®