HoneyMonkey, short for Strider HoneyMonkey Exploit Detection System, is a Microsoft Research honeypot. The implementation uses a network of computers to crawl the World Wide Web searching for websites that use browser exploits to install malware on the HoneyMonkey computer. A snapshot of the memory, executables and registry of the honeypot computer is recorded before crawling a site. After visiting the site, the state of memory, executables, and registry is recorded and compared to the previous snapshot. The changes are analyzed to determine if the visited site installed any malware onto the client honeypot computer.[1] [2]
HoneyMonkey is based on the honeypot concept, with the difference that it actively seeks websites that try to exploit it. The term was coined by Microsoft Research in 2005. With honeymonkeys it is possible to find open security holes that are not yet publicly known but are being exploited by attackers.
A single HoneyMonkey is an automated program that tries to mimic the action of a user surfing the net. A series of HoneyMonkeys are run on virtual machines running Windows XP, at various levels of patching — some are fully patched, some fully vulnerable, and others in between these two extremes. The HoneyMonkey program records every read or write of the file system and registry, thus keeping a log of what data was collected by the web-site and what software was installed by it. Once the program leaves a site, this log is analyzed to determine if any malware has been loaded. In such cases, the log of actions is sent for further manual analysis to an external controller program, which logs the exploit data and restarts the virtual machine to allow it to crawl other sites starting in a known uninfected state.
Out of the 10 billion plus web pages, there are many legitimate sites that do not use exploit browser vulnerabilities, and to start crawling from most of these sites would be a waste of resources. An initial list was therefore manually created that listed sites known to use browser vulnerabilities to compromise visiting systems with malware. The HoneyMonkey system then follows links from exploit sites, as they had higher probability of leading to other exploit sites. The HoneyMonkey system also records how many links point to an exploit site thereby giving a statistical indication of how easily an exploit site is reached.
HoneyMonkey uses a black box system to detect exploits, i.e., it does not use a signature of browser exploits to detect exploits. A Monkey Program, a single instance of the HoneyMonkey project, launches Internet Explorer to visit a site. It also records all registry and file read or write operations. The monkey does not allow pop-ups, nor does it allow installation of software. Any read or write that happens out of Internet Explorer's temporary folder therefore must have used browser exploits. These are then analyzed by malware detection programs and then manually analyzed. The monkey program then restarts the virtual machine to crawl another site in a fresh state.