Botnets are fascinating pieces of software. Yes, they really are!
Although they lie deep in the ‘dark’ stack of widespread tools used to perpetrate cyber-crime, they shine as well-engineered structures. They are forced to evolve constantly because of the ongoing ‘arms race’ between security experts and cyber-criminals.
Think about it for a second. A botnet system must be highly resilient, highly available, highly distributed, and able to coordinate a high number of nodes for criminal campaigns (spam waves, for example), all with networks of infiltrated computers that span the whole globe. That’s a lot of highs, indeed.
Let’s review some of the ‘high’ challenges and aspects of botnet evolution to give perspective on the huge efforts currently being made by security experts to study and compromise them.
Botnets are, by nature, very resilient and evasive. Their structure is designed to avoid any single point of failure, hence the hierarchical topology usually seen in such networks. Besides using increasing numbers of fallback nodes and multiplying the layers of indirection that hide the core endpoint nodes (those hosting phishing content, for example), botnets put a lot of effort into maintaining a constant communication link between the bot agents (the infected machines) and the Command and Control (C&C) nodes, the brains of such networks.
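To make the fallback idea concrete, here is a minimal sketch in Python of how an agent might walk through a list of fallback C&C endpoints until one answers. The host names and ports are entirely hypothetical, and the real logic in any given botnet is far more elaborate, but the principle is the same.

    import socket

    # Hypothetical, hard-coded list of fallback C&C endpoints; real bots
    # embed many of these, often hidden behind several layers of proxy nodes.
    FALLBACK_CCS = [
        ("primary-cc.example.net", 443),
        ("backup1-cc.example.net", 443),
        ("backup2-cc.example.net", 8080),
    ]

    def find_live_cc(timeout=5):
        """Return a socket connected to the first C&C node that answers."""
        for host, port in FALLBACK_CCS:
            try:
                return socket.create_connection((host, port), timeout=timeout)
            except OSError:
                continue  # node down or seized: fall through to the next one
        return None  # every fallback failed; a real agent would sleep and retry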
A recent investigation [1] that helped take down the Mega-d/Ozdok botnet revealed concrete examples of the new techniques used to guarantee an always-open communication channel between a bot agent and the core C&C nodes. This channel is used to receive orders and updates, send back collected information, and so on, and it illustrates the effort invested in the botnet’s high-availability characteristics.
A recent report on the Torpig botnet [2] also discusses its communication techniques, called IP-flux and Domain-flux.
IP-flux ensures that a fully-qualified domain name does not resolve to a single IP but to a large pool of IP addresses, served in a round-robin fashion with a short Time-To-Live (TTL) value so that cached DNS answers expire quickly. As The Honeynet Project puts it [3]: ‘A browser connecting to the same website every 3 minutes would actually be connecting to a different infected computer each time.’ The same article details strategies for detecting and circumventing networks that use the IP-flux method.

IP-flux not only aids availability; it is also a powerful way to dynamically add load-balancing capabilities to a network of distributed nodes. Through careful inspection of per-node performance (thanks to the feedback capabilities of bot agents), nodes can be added to and removed from the pool of available IPs depending on their performance, or on any other meaningful measure. With that technique, any task or distributed algorithm that can update a dynamic pool of nodes on-the-fly is greatly facilitated, as the sketch below shows.
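Here is a minimal Python sketch of the controller side of these two ideas, assuming a hypothetical pool of infected front-end IPs and a crude health score fed back by the bot agents; it illustrates the principle, not any real botnet’s code.

    from itertools import cycle

    # Hypothetical pool of infected front-end nodes, each with a health
    # score derived from the feedback sent back by the bot agents.
    POOL = {
        "203.0.113.10": 0.95,
        "203.0.113.22": 0.40,  # under-performing: excluded from rotation
        "198.51.100.7": 0.88,
        "192.0.2.33": 0.76,
    }

    TTL = 180  # seconds: resolvers must re-query every ~3 minutes

    def serve_a_records(n=2, min_health=0.5):
        """Yield successive short-TTL DNS answers, rotating over healthy IPs."""
        healthy = cycle(sorted(ip for ip, h in POOL.items() if h >= min_health))
        while True:
            yield [(next(healthy), TTL) for _ in range(n)]

    answers = serve_a_records()
    print(next(answers))  # the first query gets one slice of the pool...
    print(next(answers))  # ...the next query, minutes later, gets another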
To prevent a single domain name from becoming a single point of failure, and to strengthen botnets’ resilience against take-down efforts, the IP-flux technique evolved into the ‘Domain-flux’ approach. The idea is basically the same, but instead of multiple IPs, multiple domain names are involved. Each bot agent uses a deterministic algorithm to create a list of domain names on-the-fly, then tries to resolve each domain in turn until it can reach the C&C servers (and even spread itself using the drive-by-download technique). When you see domain names like io7grec9merhpzga.org or cfe3cd.wvahajol.cn (mentioned in the FireEye Malware Intelligence Lab report [1]), that’s what they might be!
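The core of such a generator fits in a few lines of Python. The sketch below is an illustrative toy (the hash choice, label length, and TLD are all made up, not the actual Torpig or Mega-d algorithm), but it shows why every agent independently derives the same daily list.

    import hashlib
    from datetime import date

    def generate_domains(day, count=10):
        """Derive the day's list of pseudo-random domain names from the date."""
        domains = []
        for i in range(count):
            digest = hashlib.md5(f"{day.isoformat()}-{i}".encode()).hexdigest()
            # Map hex digits onto letters to get a plausible-looking label
            label = "".join(chr(ord("a") + int(c, 16) % 26) for c in digest[:12])
            domains.append(label + ".com")
        return domains

    # Every infected machine computes the same list for today, then tries
    # each name in turn until one resolves to a live C&C server.
    print(generate_domains(date.today())[:3])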
There is a weak point, though. Since the mechanism used to generate the list of domain names is deterministic, we can compute the future names ahead of time! Using that information, we can register some of them before they are actually contacted by the bot agents (the full list of possible domain names cannot be completely registered by a botnet master, mostly for economic reasons, so there will be holes). Once you have some names set up, the next steps are to 1) wait for contact, 2) behave as if you are a C&C, and 3) voilà, you have infiltrated the botnet (or at least part of it).
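Building on the toy generator above, the ‘guess the future names’ step is trivial: enumerate the lists for upcoming days and register whichever names the botmaster has left unclaimed, then run a fake C&C behind them.

    from datetime import date, timedelta

    # Predict next week's rendezvous domains before the bots ever ask for them
    for offset in range(1, 8):
        day = date.today() + timedelta(days=offset)
        for domain in generate_domains(day):
            print(day, domain)  # candidate names to register as a sinkhole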
One recent and interesting update [4] is that the method used to generate such domain names has changed from a deterministic, time-based algorithm to a more random one based on Twitter! The generation algorithm is now divided into two steps: most of the domain name is first created following the same pattern described above (i.e., deterministically), and then a few characters are appended to it, derived from Twitter’s constantly changing trending topics. Since Twitter can easily be queried for the current trending topics, the bot has ready access to a pseudo-random stream of characters from which a few can be picked to complete the names. This largely defeats the kind of effort described above, since ‘guessing the future domain names’ becomes a lot more complicated.
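A hybrid generator of this kind could look something like the sketch below, again in toy form; fetch_trending_topic() is a hypothetical stand-in for querying Twitter’s public trends feed, which is exactly the part a defender cannot predict days in advance.

    import hashlib
    from datetime import date

    def fetch_trending_topic():
        """Stand-in: a real agent would query Twitter's current trends."""
        return "worldcup"  # hard-coded here; unpredictable in practice

    def generate_hybrid_domain(day):
        # Step 1: a deterministic prefix, exactly as in the older scheme
        digest = hashlib.md5(day.isoformat().encode()).hexdigest()
        prefix = "".join(chr(ord("a") + int(c, 16) % 26) for c in digest[:8])
        # Step 2: a suffix drawn from the ever-changing trending topics
        topic = fetch_trending_topic().lower()
        suffix = "".join(c for c in topic if c.isalpha())[:4]
        return prefix + suffix + ".com"

    print(generate_hybrid_domain(date.today()))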
So, what should our next step be in the face of a stream of increasingly sophisticated malware techniques? As things get more complex, what can uncoordinated, local, and specialized efforts do to stop this?
Further reading/references:
[1] ‘Smashing the Mega-d/Ozdok botnet in 24 hours’, FireEye Malware Intelligence Lab, 2009
[2] ‘Taking over the Torpig botnet’, The Computer Security Group at UCSB, 2009
[3] ‘Know Your Enemy: Fast-Flux Service Networks’, The Honeynet Project (honeynet.org), 2007
[4] ‘Your Botnet is My Botnet: Analysis of a Botnet Takeover’, The Computer Security Group at UCSB, 2009