Confessions of a sysadmin

Original URL: https://www.theregister.com/2010/06/01/sysadmin_confessions/

I found a virus on my network today…

Posted in Systems, 1st June 2010 10:26 GMT

Blog I would like to say that it has been a few days since my last malware-infected computer. I have been dealing with a string of these lately, and I’ve had quite enough of them for now, thank you.

I would also like to say my network was the epitome of configuration perfection, with every system fully patched, and a team of network ninjas facing off against hired pirates in a never-ending battle for security perfection. The truth, however, is less ideal. My network has some systems that can’t ever be patched, and others where IT can’t force automatic patches. Configuration errors will inevitably exist due to a combination of lack of time, lack of knowledge or prioritization of IT tasks.

According to the email of article topics in my inbox, this one is supposed to be about the importance of proper configuration and patch management. Instead of being able to stand atop an ivory tower and reveal to you the secrets of perfect network management, I am forced to humble myself before the entire internet with a confession:

I discovered the Conficker worm on my network today.

I am shamed by this because the infection was entirely preventable and all the more because this discovery occurred the day this very article was due. While I had a lovely sermon prepared in which I would discuss why proper configuration and patch management are so very important, I think that doing a post-mortem on exactly how I contracted these bugs will be both far more entertaining, and perhaps even a little enlightening.

I discovered the infection today on Windows 2000 systems running Service Pack 4. Each system serves as a network and command and control interface for a large piece of equipment (think the size of a small car).

The hardware they are running on is fairly old (they talk to their attached equipment via a truly ancient SCSI card) and the software is remarkably picky and brittle. If installed exactly as directed, the computers (and their attached equipment) run just fine. Install the wrong Windows update or change the wrong setting and they will refuse to work.

As an added bonus, the hardware specifications on the provided kit are so exact that if you were to (for example) load an anti-malware scanner on the system, the resulting performance hit would seriously reduce the productivity of the unit. Any decrease in output capacity of these units simply will not be tolerated.

The more I delve into the situation the more I am convinced these systems were infected a while ago. We had a user who opened an infected attachment (Windows XP, and yes they had to be running as an administrator to get their work done). For the curious, it was a PDF. This turned out to be Conficker, which in turn ate every vulnerable computer on the entire network in about 15 minutes flat, and a fun night was had by all.

After we had sent the initial Conficker infection shrieking back into the void from which it arose, we ran around to every single computer on the network and checked them one at a time. We remoted into each one in turn and ticked it off against our internal IT list of systems, the DHCP leases and even a LANguard scan. After a few hours of fighting this particular brushfire, we were satisfied the network was clean and went home. By the time we arrived the next day we were on to the next problem, and the infection was almost completely forgotten.

This is where I made a big mistake.

The systems I discovered as infected today were, at the time we started cleaning the network, simply turned off. At the end of every work day, when the staff who use that equipment are done with it, they shut it down. The systems must have been active when the initial infection took place, and were turned off by staff members leaving for the night after we booted everyone off the network.

What’s worse, I completely forgot that those systems had Windows computers in them. They were, as computers integrated into larger pieces of equipment, out of sight and thus out of mind. (Let that be a lesson to you all: computers are integrated into everything these days. Think really, really hard about what’s on your network before declaring it bug-free.)
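In hindsight, the check that would have caught this is a boring set comparison: anything that appears in the inventory or the DHCP leases but never answered the scan has not been verified clean, it has merely been quiet. Here is a minimal sketch in Python, assuming the three lists have already been exported by whatever means you have; the host names and the function name `unverified_hosts` are made up for illustration.

```python
def unverified_hosts(inventory, dhcp_leases, scanned):
    """Return hosts known from any record that the live scan never saw."""
    # Anything we know about from either record source...
    known = set(inventory) | set(dhcp_leases)
    # ...minus anything we actually reached and checked.
    return sorted(known - set(scanned))


# Hypothetical host names, for illustration only.
inventory = {"ws-01", "ws-02", "press-a", "press-b"}
dhcp_leases = {"ws-01", "ws-02", "press-a", "laptop-07"}
scanned = {"ws-01", "ws-02", "laptop-07"}

# Hosts that were powered off (or otherwise unreachable) during the sweep.
print(unverified_hosts(inventory, dhcp_leases, scanned))
```

A host that turns up in this list is not necessarily infected; it just has to be checked before anyone declares the network clean.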

Knowing how these systems got infected, let’s delve into how I could have prevented this from occurring. The first and most obvious problem is that of patch management. I have a Windows Server Update Services (WSUS) server on my network to distribute patches, and I am very fastidious about testing patches against existing software and releasing the updates as soon as possible.

For those who don’t know about it, WSUS is an absolutely wonderful utility offered up for free by Microsoft that essentially serves as your own private patch server. Properly configured via GPO, your WSUS server is the only system on your network that needs to grab Microsoft updates. You then release patches to computers on your network, which will download these patches from your WSUS server. This can not only help you save on bandwidth, but more importantly can give you time to vet patches for compatibility with existing applications before allowing them onto your systems.

Of course WSUS doesn’t help you if the patches never get installed. Certain computers (those belonging to management for example) are set up not to automatically install patches when released by WSUS. Rather, these systems are configured to simply download them and inform the user that updates are available. If the user chooses to ignore the little yellow shield in the system tray, then those patches simply never get installed. Evidently for periods that stretch into months at a time.

I would like to blame the users for this; how easy it would be to simply exclaim, “they should be patching their computers!” This is merely the easy way out - users are only half of the patch management problem. Part of the blame lies on my shoulders for not being more of a pest on the topic, and more bluntly for not checking who the laggards in patching their computers were. The facility for doing so is built right into WSUS and I simply never used it.
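The laggard check itself is trivial once you have the data. The sketch below is not the WSUS API; it assumes you have somehow exported a mapping of computer name to the date it last installed updates, and the names, dates and 30-day threshold are all hypothetical.

```python
from datetime import date, timedelta


def patch_laggards(last_installed, today, max_age_days=30):
    """Return computers whose last update install is older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(name for name, when in last_installed.items() if when < cutoff)


# Hypothetical export: computer name -> date updates were last installed.
last_installed = {
    "ws-01": date(2010, 5, 25),
    "mgmt-03": date(2010, 1, 12),
    "mgmt-07": date(2010, 3, 2),
}

# Anyone listed here gets a visit from the resident pest.
print(patch_laggards(last_installed, today=date(2010, 6, 1)))
```

Run weekly, something like this turns "months at a time" into a list of names to chase.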

Had these systems been patched, then the initial point of infection, arriving via email vector, would still have occurred. It just wouldn’t have gotten very far because most everything else would have been patched. I would have saved myself a night of fighting Conficker across the network, but even this preventative measure would not have prevented the infection of the unpatchable Windows 2000 systems that are the topic of this article.

This is where the “misconfiguration” comes into play. There are a few very simple configuration items that in retrospect would have made a world of difference.

Misconfiguration no-no number one was that the systems in question were on the same subnet as the rest of my network. These machines have to talk to a very small group of other computers: a file server, a domain controller and a controlling workstation. These systems, being out of date and clearly vulnerable, should have been as isolated from the rest of the network as possible. In my configuration this particular defence is possible, and the only real reason it was never implemented is convenience. Isolating defenceless systems in their own subnet isn’t possible for everyone, and this leads into the second major misconfiguration.
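If you do carve out a subnet for the fragile machines, it is worth auditing that they actually live there. A minimal sketch using Python's standard `ipaddress` module follows; the addresses, host names and the `192.168.50.0/24` range are assumptions for the example, not my real addressing.

```python
import ipaddress


def outside_isolated_subnet(hosts, isolated_cidr):
    """Return the names of hosts whose address is not inside isolated_cidr."""
    net = ipaddress.ip_network(isolated_cidr)
    return sorted(
        name for name, addr in hosts.items()
        if ipaddress.ip_address(addr) not in net
    )


# Hypothetical legacy machines and the subnet they are supposed to be in.
legacy_hosts = {
    "press-a": "192.168.50.21",
    "press-b": "192.168.1.87",  # still sitting on the main LAN
}

print(outside_isolated_subnet(legacy_hosts, "192.168.50.0/24"))
```

The subnet alone does nothing, of course; the router or firewall between it and the main LAN still has to permit only the file server, the domain controller and the controlling workstation.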

These known defenceless systems were not only on the same subnet as the main network, they were using DHCP. DHCP hands out a default gateway to anything that asks, and a default gateway gets you to the internet.

My big stonking pieces of equipment with the outdated unpatchable operating system have absolutely no reason to ever be able to access the internet. I should by all rights have either set these systems up with DHCP reservations such that they did not receive default gateways, or set them up as static IPs.
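Why the missing gateway matters can be shown with a deliberately simplified model of how a host decides where to send a packet: destinations inside its own subnet are reached directly, and everything else needs a gateway. This Python sketch ignores static routes and other real-world wrinkles, and the function name and addresses are hypothetical.

```python
import ipaddress


def can_reach(src_iface, gateway, dest):
    """Simplified routing decision for a host with one interface.

    src_iface is the host address with prefix, e.g. "192.168.50.21/24".
    gateway is the default gateway address, or None if none was handed out.
    """
    iface = ipaddress.ip_interface(src_iface)
    if ipaddress.ip_address(dest) in iface.network:
        return True  # on-link: delivered directly, no gateway needed
    # Off-link traffic has nowhere to go without a default gateway.
    return gateway is not None


# With no gateway, the file server on the same subnet is still reachable...
print(can_reach("192.168.50.21/24", None, "192.168.50.10"))
# ...but an address out on the internet is not.
print(can_reach("192.168.50.21/24", None, "8.8.8.8"))
```

The same logic is why this only helps if the handful of machines the equipment needs are on the same subnet as the equipment itself.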

A separate DNS server consisting of only the minimum required information for these defenceless systems to operate would also have been a good plan. I could point these systems at the 'walled garden' DNS servers, further limiting their exposure to the outside world.

These older systems should also have had a firewall installed. Windows 2000 doesn’t ship with one of its own, but small lightweight firewalls are cheap and don’t consume much in the way of resources - far less than installing an anti-malware scanner. Even on a 'trusted' network like your home or corporate one it’s just not a good idea to be running without firewalls enabled on the local systems any more. (Get used to it folks; IPv6 doesn’t support NAT...) Certainly in the case of systems that can’t be patched, firewalls are absolutely necessary. A simple firewall configured only to allow SMB from the one computer that actually had a good reason to use that protocol would have prevented this infection.
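The rule in question is small enough to express in a few lines. The following Python sketch models the inbound policy only, it is not the configuration of any particular firewall product, and the peer address is hypothetical. SMB is taken here to mean TCP ports 445 and 139.

```python
# Ports used by SMB: 445 (direct) and 139 (over NetBIOS session service).
SMB_PORTS = {139, 445}

# Hypothetical address of the one machine with a good reason to use SMB.
ALLOWED_SMB_PEER = "192.168.50.10"


def allow_inbound(src_ip, dst_port):
    """Default-deny inbound policy: SMB from one peer, nothing else."""
    if dst_port in SMB_PORTS:
        return src_ip == ALLOWED_SMB_PEER
    return False


# The controlling machine may connect; an infected desktop may not.
print(allow_inbound("192.168.50.10", 445))
print(allow_inbound("192.168.1.87", 445))
```

An infected workstation elsewhere on the LAN would have found port 445 closed to it, which is exactly the door the worm needed open.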

What saved me was my edge device configuration - specifically the integration of the malwaredomains.com blocklist into my edge firewall. I had previously added the Domains.txt main blocklist as well as the DNS-BH list of 90,000+ Conficker domains to my blocklist. This meant that although Conficker had found its way onto my network, it was effectively neutered. It could try to infect other systems on the LAN, but couldn’t get out to the wider web to download its payloads, or send out spam. For those unfamiliar with malwaredomains.com, I can’t recommend their blocklist enough. The blocklist is offered as either a BIND zone file or a Domains.txt file. There are importers for Domains.txt that will ensure you can use it with anything from ISA Server to Smoothwall.
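For anyone wondering what the edge device is actually doing with such a list, the core of it is a suffix match on the requested hostname. The Python sketch below assumes the list has already been reduced to one domain per line with `#` comments; the real Domains.txt layout may differ, and the domains shown are invented placeholders.

```python
def load_blocklist(text):
    """Parse a one-domain-per-line list, ignoring blanks and # comments."""
    blocked = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            blocked.add(line.lower().rstrip("."))
    return blocked


def is_blocked(hostname, blocked):
    """True if hostname, or any parent domain of it, is on the blocklist."""
    labels = hostname.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in blocked for i in range(len(labels)))


# Invented placeholder entries, not real blocklist content.
sample = """
# simplified blocklist
bad-example.test
payloads.invalid
"""

blocked = load_blocklist(sample)
print(is_blocked("cdn.bad-example.test", blocked))
print(is_blocked("www.theregister.com", blocked))
```

Matching parent domains as well as exact names matters because malware will happily use a fresh subdomain of a domain that is already listed.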

So there we have it: bad patch management and misconfiguration resulted in systems on my network becoming infected. A little bit of foresight neutered the malware once it was in place, but that doesn’t excuse the lapses that let these systems become infected. In my next article I will discuss policies, procedures and tools that will allow us to detect infected computers in near real-time. I will also address patch management at an application level as well as common configuration errors that lead to both desktops and servers becoming compromised. ®