Friday, December 5, 2008

Rogue DHCP servers cause perceived service outages

Starting last week, we have been having issues with groups of machines experiencing unexplained network connectivity outages. The pattern is typically that one machine loses connectivity and, over the course of an hour or two, many more follow. Almost from the beginning, I had a gut feeling that something external to the machines was affecting their behavior; the pattern did not point at a large-scale distribution of malware.

Late last week, I formulated the hypothesis that we might have some rogue DHCP servers popping up here and there.

Since the machines that were having the problems revert to a presumed-good image upon reboot, finding the actual problem was a little bit of a challenge. Our only real option was to wait for another outbreak, followed by a focused and temporary capture of network traffic.

A few days ago, we got a call that machines were experiencing browsing difficulties, and we were able to capture some network traffic that confirmed my hypothesis: a box that was not one of our DHCP servers was sending out DHCP offers.

The offers convinced the clients that received them to switch their DNS to an external machine. We block outbound DNS to anything but our own servers, so the result was that the affected machines were no longer able to resolve any host names. Not a good thing, since they are primarily used as web browsers (non-proxied).
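For readers who want to look for the same thing in their own captures, here is a minimal sketch of the check we effectively did by eye: parse the DHCP options of a reply and flag it if the server identifier or the DNS servers it hands out are not ours. It is not the tooling we used. The addresses in the two allowlists and the function names are made up for illustration, and the code assumes you already have the raw UDP payload of each DHCP packet (getting it out of a capture is left out).

```python
import ipaddress

# Hypothetical allowlists; replace with your own DHCP and DNS servers.
TRUSTED_DHCP_SERVERS = {"10.0.0.2"}
TRUSTED_DNS_SERVERS = {"10.0.0.53", "10.0.1.53"}

MAGIC_COOKIE = b"\x63\x82\x53\x63"   # marks the start of the DHCP options
OPT_DNS, OPT_MSG_TYPE, OPT_SERVER_ID = 6, 53, 54
DHCP_OFFER, DHCP_ACK = 2, 5


def parse_dhcp_options(payload: bytes) -> dict:
    """Return {option_code: raw_bytes} for a BOOTP/DHCP payload (UDP body)."""
    # The fixed BOOTP header is 236 bytes, followed by the 4-byte magic cookie.
    if len(payload) < 240 or payload[236:240] != MAGIC_COOKIE:
        raise ValueError("not a DHCP payload")
    options = {}
    i = 240
    while i < len(payload):
        code = payload[i]
        if code == 255:          # end option
            break
        if code == 0:            # pad option, no length byte
            i += 1
            continue
        if i + 1 >= len(payload):
            break                # truncated option, stop parsing
        length = payload[i + 1]
        options[code] = payload[i + 2:i + 2 + length]
        i += 2 + length
    return options


def check_dhcp_reply(payload: bytes) -> list:
    """Return a list of findings; an empty list means nothing suspicious."""
    opts = parse_dhcp_options(payload)
    msg_type = opts.get(OPT_MSG_TYPE)
    # Only server replies (OFFER and ACK) carry the settings we care about.
    if not msg_type or msg_type[0] not in (DHCP_OFFER, DHCP_ACK):
        return []
    findings = []
    server_id = opts.get(OPT_SERVER_ID, b"")
    if len(server_id) == 4:
        server = str(ipaddress.IPv4Address(server_id))
        if server not in TRUSTED_DHCP_SERVERS:
            findings.append(f"reply from unknown DHCP server {server}")
    dns = opts.get(OPT_DNS, b"")
    for j in range(0, len(dns) - 3, 4):   # option 6 is a list of 4-byte addresses
        resolver = str(ipaddress.IPv4Address(dns[j:j + 4]))
        if resolver not in TRUSTED_DNS_SERVERS:
            findings.append(f"reply hands out unknown DNS server {resolver}")
    return findings
```

Checking the DNS option as well as the server identifier is deliberate: in our case the external DNS server in the offer was what actually broke name resolution.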

As we were wrapping up the incident, segmenting our network even further, putting in additional monitoring, and doing some additional hardening in certain areas, the Internet Storm Center posted a diary message that was timely and on topic. The subject of the message was Rogue DHCP servers.

Not even half a day later, I was on the phone with my previous employer and, lo and behold, they were suffering outages and seeing strange DHCP traffic. It seems that whatever site is offering this Trojan.Flush.M malware is very effective at reaching institutions of higher education worldwide.

Symantec ranks the risk of this malware as "risk level 1: very low". I disagree with this; risk is a function of probability and impact, and looking at my own experiences, both are high. I have first-hand knowledge of at least three institutes that have been hit, and the impact of getting hit (loss of the ability to resolve hostnames) is also large.

How do we defend against this? The most important steps are to ensure proper network segmentation and to provide up-to-date anti-malware software on workstations. On servers, you probably want to statically configure your DNS settings. Make sure you can monitor what's happening on your network, block outbound DNS traffic to everything except your own servers, and consider deploying hardened proxy servers for browsing.
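Blocking outbound DNS has a useful side effect for monitoring: every blocked attempt points at a client that has been talked into using a foreign resolver. The sketch below counts such attempts per client. The `Flow` record, the resolver addresses, and the function name are assumptions for illustration; a real firewall or NetFlow export will need its own parsing.

```python
from collections import Counter
from typing import Iterable, NamedTuple

# Hypothetical resolver addresses; replace with your own DNS servers.
APPROVED_RESOLVERS = {"10.0.0.53", "10.0.1.53"}
DNS_PORT = 53


class Flow(NamedTuple):
    """One connection attempt, e.g. from a firewall log or flow export."""
    src: str
    dst: str
    dst_port: int


def clients_using_foreign_dns(flows: Iterable[Flow]) -> Counter:
    """Count DNS attempts per client that target a non-approved resolver."""
    hits = Counter()
    for flow in flows:
        if flow.dst_port != DNS_PORT:
            continue
        # Our own resolvers legitimately talk to outside DNS servers.
        if flow.src in APPROVED_RESOLVERS:
            continue
        if flow.dst not in APPROVED_RESOLVERS:
            hits[flow.src] += 1
    return hits
```

A client that suddenly shows up in this count is a good candidate for having accepted a rogue offer, and worth looking at before the help desk calls start.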
