The Register® — Biting the hand that feeds IT

Malware: Windows is only part of the problem

On coding secure and resilient applications

We’ve all been hearing a lot about secure applications recently, or more accurately about insecure applications; specifically those that are exploited in identity theft raids or that we can be “tricked” into running on our PCs.

Insecure applications are such a problem that Microsoft has spent the last five years and many millions of dollars re-engineering its operating system and much of its other software in order to improve the situation [and can one ever really overcome the temptation to bolt security onto a fundamentally insecure design, in pursuit of “backwards compatibility”, in such circumstances? – Ed].

Other software providers are doing the same thing and there has been an explosion of anti-virus and spyware removal vendors in the industry. It’s not that software has suddenly become insecure; rather, the internet now gives criminals a viable means of exploiting these insecurities for ill-gotten gains.

It’s estimated that there are 15 million instances of identity theft each year. Many of these identities are stolen in direct attacks against e-commerce websites, universities and government systems, but others are the result of malware installed on our home computers recording our keystrokes.

According to the Microsoft security intelligence report (available here) we saw the emergence of over 40,000 new trojans and 30,000 password loggers specifically designed to steal identity information for online banking in the first half of 2006. In the same period, the Microsoft malicious software removal tool disinfected almost 10,000,000 computers.

With these figures, it’s fairly safe to say it’s a significant problem; however, there are a number of misunderstandings about the cause. Frequently, the tendency is to blame the operating system and in Microsoft’s own words (here [2Mb Powerpoint]) “Windows 98 clients cannot be effectively secured”.

However, it really isn’t just a question of operating system shortcomings; it’s often the applications and services running on the operating system that provide the open backdoors to malware, and the operating system simply can’t stop them.

The worst example of this is the now defunct Kazaa, which was software explicitly designed to mislead the user about its true function - while pretending to provide p2p functions it secretly installed spyware and adware all over our PCs.

Clearly, nobody should trust software coming from an unknown source with unknown motives; this was the lesson one should have learnt from Kazaa and it’s an extreme one. Nowadays, however, malware finds its way into our systems through security holes even in application software that was designed and implemented honestly.

The classic example of such a backdoor is a buffer overflow attack arising from malicious input. In the early days, a URL with over 256 characters could cause Internet Explorer to execute arbitrary code. The arbitrary code is chosen by the attacker and almost certainly, the payload will be either malware, or something that installs malware.

Another example of a real vulnerability is a JPEG decoder that allowed arbitrary code to be executed when decoding an incorrectly formatted image. All the attacker had to do was place such an image on a website or send it in an email; your browser would try to load it and his/her code would be executed.

However, the point here isn’t these specific issues (which shouldn’t be a problem any longer, as long as you've patched your operating system and applications up-to-date) but that the applications rather than the operating system are now the entry point for malware and that the vulnerabilities can arise in subtle scenarios. Who would have thought that simply looking at a picture could leave your computer wide open to an attacker?

Buffer overflow attacks have been around since the early days and are possible because of how programs execute on computers such as the PC running Windows, which keep return addresses on the same stack as local data. There’s nothing particularly clever about them, but new code continues to be vulnerable; hackers are good at finding these flaws and, unfortunately, we good guys don’t have the best record at preventing them. In this series of articles, we want to look at how this and other attacks work and to highlight what can be done to make our code less vulnerable.

So how does the buffer overflow work? On a Windows PC, programs execute by starting in the main function and calling other functions to perform calculations. As each function is called, its stack frame is pushed onto the stack. This stack frame consists of local variables and housekeeping information, including the stack base pointer of the calling function and the return address. The last part is particularly important for the buffer overflow attack: when the function returns, the next instruction to be executed is the one pointed to by the return address.

In C and C++ programs, coders allocate arrays with fixed sizes appropriate for the expected, valid input. In a buffer overflow attack, an attacker provides input that is deliberately malformed so that more data is written to the array than it can hold and, as a result, the return address mentioned above is overwritten.

When the function returns the system continues execution at the address specified by the attacker, and we can assume that it will point to code that we don’t want executing on our system; and, of course, since malware coders are no better at testing their code than anyone else, and it’s quite hard to test this stuff in practice anyway, at best your system may crash nastily.

The following example shows how this works with carefully crafted parameters. You’ll see that bar is never called; however, a buffer overflow in foo causes the code of bar to be executed nonetheless.

Buffer Overflow example

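#include <stdio.h>
#include <stdlib.h>
#include <string.h>

namespace {
  // Never called directly; reached only via the overwritten return address.
  void bar() {
    printf("hello world");
    exit(0);
  }

  void foo(char *psPtr) {
    char vsBuf[8] = "ABCDEFG";
    // Deliberately unsafe: strcpy does not check that psPtr fits in vsBuf.
    strcpy(vsBuf, psPtr);     // [1]
  }
}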

In this example, the code that calls foo is located just before memory address ’00 40 11 46’ and the code of bar is located at ’00 40 10 30’. These addresses appear on the stack in reverse byte order because our test system is little endian. (00 40 11 46 is seen below as 46 11 40 00).

We stop execution at [1] and examine the stack. The local variable containing “ABCDEFG” is clearly visible followed by a pointer (not important; but it’s the stack base of the calling function). After this we see the value ’46 11 40 00’ – this is the return address of the function if you read it in reverse.

0012FF02  CC CC CC CC CC CC  ÌÌÌÌÌÌ
0012FF08  CC CC CC CC 41 42  ÌÌÌÌAB
0012FF0E  43 44 45 46 47 00  CDEFG.
0012FF14  80 FF 12 00 46 11  .ÿ..F.
0012FF1A  40 00 6C FF 12 00  @.lÿ..

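We now execute the string copy, with foo called from the following main function.

int main() {
  // Eight filler bytes for vsBuf, four bytes over the saved base pointer
  // (its last byte becomes 0x10 because a zero would end the copy early),
  // then the address of bar in little-endian order. The trailing 0x00
  // doubles as the string terminator.
  char myString[20] = {
    'Z',  'Z',  'Z',  'Z',  'Z',  'Z',  'Z', 'Z',
    (char)0x80, (char)0xFF, 0x12, 0x10, 0x30, 0x10, 0x40, 0x00,
  };

  foo(myString);
  return 0;
}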

After executing the string copy, we see that the function return address has been overwritten. When the function returns, execution will now continue at the location of bar.

0012FF02  CC CC CC CC CC CC  ÌÌÌÌÌÌ
0012FF08  CC CC CC CC 5A 5A  ÌÌÌÌZZ
0012FF0E  5A 5A 5A 5A 5A 5A  ZZZZZZ
0012FF14  80 FF 12 10 30 10  .ÿ..0.
0012FF1A  40 00 6C FF 12 00  @.lÿ.. 

Finding exploits

But how do hackers find code like this that they can exploit? Well, the applications that we run are freely available. All a hacker has to do is take the application and start feeding it random data or corrupted files as input. In some cases, this data will cause crashes, and a percentage of these crashes are exploitable. The specific corrupted data that caused the crash then becomes the starting point for the hacker's analysis on the way to a new exploit. In other cases, where the source code is available, the attacker can analyze the code and find security holes. In any event, it’s not safe to assume that insecure code will never be exploited because it will never be found; it does get found, using methods like those described here and others, as the statistics quoted earlier show.

We know that organized crime is moving into technical areas and that there’s a black market for exploits that can be used to install malware on your PC (see here for example). And “zero day exploits” (that is, exploits for which no patch yet exists) for Windows Vista are reported here to be on sale for upwards of €10,000.

Why would crime pay this kind of money? Well, one working exploit works against every machine running the vulnerable software, at least until a patch is developed and installed. A subverted machine can bring in revenue in many different ways: participating in botnets, sending spam, collecting identity information on its users; the list goes on…

So faced with this kind of well-resourced and determined attacker, what can developers do? As always, awareness helps: developers who are aware of the problem and its causes are less likely to make the mistake.

Additionally, code analysis tools, such as those from Fortify and Programming Research, analyze source code and report likely security problems, and some C and C++ compilers contain features that provide a level of protection from these vulnerabilities.

In VC7 this is called the Buffer Security Check (the /GS compiler option). Unfortunately, the implementation does not cover all scenarios, and for the scenarios it does cover, questions have been raised here about its effectiveness. However, the idea of increasingly using tools to ensure secure software is a good one; and, hopefully, future evolutions of this will be more complete.

Microsoft has produced a website describing how the software development process can be improved to produce more secure code, and in upcoming articles we’ll look at some of these techniques, including threat modelling and abuse cases. We’ll also look at some of the other security holes that can creep into software and be exploited through attacks such as SQL injection or cross-site scripting. In the meantime, give some thought to disabling some of the services and applications you don’t use! ®

Latest Comments

Followup regarding separate stacks

I'm not familiar with the 80xxx instruction set (though I am with a number of others) but I'm not fully convinced by the backward compatibility argument. For that to hold, code would have to be doing something very direct, such as an explicit "load the last address in the stack frame into the program counter" in order to do a subroutine return. In fact most ISAs have a separate "RTS" type instruction, which does this operation implicitly. If that's the case, the internal implementation of the RTS is not subject to backward compatibility constraints - it can load an address obtained from any stack into the program counter. Stack frame alignment might be subject to compatibility - in which case such a processor can simply push an unused value onto the stack frame as a placeholder.

Separate code and data stacks

"Why don't CPU manufacturers simply keep the data and code/return address stacks separate?" -- Graham

This actually isn't a bad idea. Some processor architectures (specifically, ones with a highly orthogonal instruction set such as the PDP-11, ARM, MIPS, 68000 and PowerPC families) could already support this without much effort. You just need to have two instructions; "put the contents of register Y into the address pointed to by register X and decrement register X" and "increment register Y and get the contents of the address pointed to by register Y into register X". Then use any two registers as a "data stack pointer" and "execution stack pointer" respectively. Obviously, you would need to keep the data stack at a lower address in memory than the execution stack so the former cannot grow into the latter.

It's still not perfect, because code could still deliberately modify the execution stack -- or directly alter the data stack pointer to point into the execution stack. But such code *wouldn't* be able to get itself executed by anything so trivial as a stack overflow.

As for why it's not done; well, I suppose that all dates back to the 8080 which had only one stack pointer, implemented as a simple up/down counter, and corresponding dedicated PUSH and POP instructions. The 8086 implemented the 8080 instruction set, and kept the single stack. Everything since then, all the way to the Core 2 Duo, has carried 8086 -- and therefore 8080 -- legacy baggage. We can't get away from the 80x86 instruction set as long as anyone has 80x86 binaries they need to run.

The fact remains that the BEST way to make sure code doesn't contain exploits is just never to run any software whose Source Code has not been audited by independent (i.e., not connected with the author) experts. Auditing of source code, possibly in conjunction with provision of patches for any bugs discovered, would seem to be a service which has Intrinsic Value (i.e., if you are a programmer, you could make money doing this for people).

Of course, this approach is somewhat incompatible with Microsoft's (and others') business model, where they keep the Source Code a jealously-guarded secret *even from the people who are using the software* ! They've been getting away with that for long enough now that most people don't even know what Source Code *is*, or why it's important to them. Access to Source Code is a prerequisite not only for auditing software for vulnerabilities (bear in mind that, for every dishonest hacker looking for something they can exploit, there are several honest hackers looking for the same problems with the intent to release a patch and cure them ..... it's a matter of definition that good guys outnumber bad guys), but also for adapting the software you use to suit the way you do business. Otherwise, you would have to adapt the way you do business to suit the software you use.

Mind your language

Graham - there have been CPUs designed with increased security in mind, but the main issue is backward compatibility, which is the issue that continually dogs Windows. It's no good having trusted computing when 80% of the software you use needs to be run in untrusted mode (and half your hardware drivers are by 'unknown' and unsigned - look at your Windows services to see what I mean).

Steven - application development would be a lot quicker, and safer, if we could trust our programming languages to do what they appear to say. The choice of C++ as the major application - rather than system - programming language is a problem in itself. The programming community has continually rejected safer languages (e.g. Ada) in favour of something powerful but unsafe.

There were - of course - other pragmatic reasons - runtime safety checking every variable assignment against type declaration is a performance hit. Which is why C++ slaughtered higher level OO languages.

It's worth reading Wirth's paper on a history of good / bad ideas in programming.

http://www.cs.inf.ethz.ch/~wirth/Articles/GoodIdeas_origFig.pdf

Lastly, however, some blame does still have to go to Windows itself - an application may open a back door, but it should have been a lot harder to download and execute an application without the user's consent, and near impossible to modify the system directories. Vista thankfully takes us closer to this point.
