ebook img

Building and deploying Billy Goat, a Worm-Detection System PDF

12 Pages·2006·0.23 MB·English
by  
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Building and deploying Billy Goat, a Worm-Detection System

Building and deploying Billy Goat, a Worm-Detection System James Riordan, Diego Zamboni, Yann Duponchel IBM Zurich Research Laboratory {rij,dza,ydu}@zurich.ibm.com May 8, 2006 Abstract forevidenceofworminfections. Thefocusofthispa- per is on the technical mechanisms and constructs Billy Goat is a worm detection system widely de- that we have used in building Billy Goat, to allow ployed throughout IBM and several other corporate it to operate reliably in very large deployments and networks. We describe the tools and constructions processing extremely large volumes of data. that we have used in the implementation and de- The rest of the paper is structured as follows: ployments of the system, and discuss contributions Section 2 provides an overall description of worm- which could be useful in the implementation of other detection systems in general. Section 3 presents the similar systems. We also discuss the features and re- architecture of Billy Goat, both as a sensor and as quirementsofwormdetectionsystemsingeneral,and a distributed system. Section 4 talks about related how they are addressed by Billy Goat, allowing it to work. Section 5 describes the data models used perform reliably in terms of scalability, accuracy, re- throughout Billy Goat. In Section 6 we describe silienceandrapidityindetectionandidentificationof some of the components we have implemented, and worms without false positives. decisions we have taken while building Billy Goat, including constructs that can be used in other con- Keywords: intrusion detection, Internet worms, texts. Sections7and8describe some ofourobserva- information security, honeypot. tionsandexperiencesduringBillyGoatdeployments. Section9concludesthepaper,andpresentssomeav- 1 Introduction enues for future work. Recent years have brought a continued increase in 2 Worm Detection Systems boththeimportanceofsecurityinnetworkedsystems andthedifficultyofsecuringthem. Beyondthebasic need for integrity, confidentiality and privacy, secu- The requirements on a worm-detection system rityhasbecomeessentialtowardprovidingreliability, (WDS) are different from those of a general-purpose safety, and freedom from liability. network-based intrusion-detection system (NIDS). One of the greatest threats to security has come While the latter needs to detect a wide and unpre- from automatic self-propagating attacks, including dictable variety of attacks, the former can focus on viruses and worms. The presence of these attacks specific attacks and strategies used by worms. The is not new, but the damage that they are able to in- mainpurposeofaWDSistodetectinfectedmachines flict and the speed with which they can propagate in the network, whereas a general-purpose IDS must have become paramount. Increases in connectivity detect the attacks themselves. These requirements andcomplexityonlythreatentoexacerbatetheirvir- influence the desired characteristics of the system, ulence. particularly in the following aspects: Thispaperdescribesawormdetectionsystemthat Accuracy: Toofferrealutility,aWDSmustbeable we have built and deployed, throughout IBM and in to identify worm-infected machines with a high level several customer networks, over the past five years. of accuracy, so that it can be trusted as the basis for The deployment within IBM covers the entirety of contention and remediation action. A WDS can use the corporate intranet, automatically gathering data highly-specializedtechniquestodetectworm-infected from approximately 1.2 million virtual sensors, cen- machines. Thisenablesincreasedaccuracy,attheex- tralizing the data to form a single coherent model of penseoftheabilitytodetectawiderrangeofattacks. suspiciousnetworkactivity,andanalyzingthismodel Speed: Given the explosive nature of modern 1 worms [8], a WDS should be able to detect an in- addresses. By doing so, they find most of the ma- fected machine as quickly as possible, to provide its chines in a network, but they also try to connect usersachancetocontainthedamage,oreventofunc- to a large number of unused addresses. Billy Goat tion as the basis for an automated response system. functions by responding to requests sent to unused Manageability: Atasystemslevel,installationand addresses, feigning the existence of a large number update of the components must be automated as of machines and services. This approach has three much as possible, to be able to deal with monitor- immediate consequences: ing very large networks. At middleware and archi- tecture levels, the base infrastructure must offer suf- • Active feigning of services, rather than the mere ficient flexibility to enable the rapid creation of new recordingofconnectionattemptsenablesgreatly detection capabilities. improvedunderstandingofthenatureofthecon- Interoperability: AWDSshouldintegrateasmuch nection, and gives Billy Goat an advantage in as possible with the existing tools and processes, accuracy. to avoid unnecessary proliferation of independently- managed security tools. • Thefactthattheaddressesareotherwiseunused Resilience: A WDS must operate under extreme andnotadvertisedmeansthatalltrafficdestined conditions in terms of network and processing load, to them is a priori suspicious. particularly during worm outbreaks. These condi- tions are more likely to induce failures than other • The large number of addresses used gives Billy environments. However, a WDS has a specific ad- Goat an extensive view of the network. vantage that is not enjoyed by most other IDSs: be- Insteadofdirectly“guardingthevaluables,”astradi- cause of the repetitive nature of worm activity, the tionalintrusiondetectiondeploymentsdo,BillyGoat WDS can afford to lose some data without reducing guards vast ranges of “nothingness” toward under- itsutility. Inpractice, thismeansitissatisfactoryto standing who goes there and why. This is similar to buildasystemthatcan“forcefully”recoverfromfail- a honeypot, although it lacks the eponymous honey ure (forexample, byautomaticallyrebootingoreven andmayhavedifferentscopeandgoalsdependingon reinstalling itself) rather than trying to resist it. the definition of honeypot used (see also Sec. 4). Graceful degradation: While WDSs may benefit Thisapproach,permittedbytheclearfocusonde- fromadistributedarchitecture,mostwormoutbreaks tecting worms, coupled with the analysis performed have the effect of overloading network links. It is on the data (Sec. 6.8) frees Billy Goat from the therefore necessary for all sensors to be able to op- highrateoffalsepositivesproducedbymostgeneral- erate on their own, so that even if the global system purposeIDSs. Forthesamereason,itisnotareplace- is impeded, its individual sensors can still be useful ment for other IDSs but a complement to them. In during a worm outbreak. particular, Billy Goat will not see the traffic directed to existing machines and services, so it is unable to 3 Billy Goat architecture detect attacks against them. AtthecoreofBillyGoatareavirtualizationmech- Billy Goat possesses the characteristics described anism and a data repository, shown in Figure 1. The above toward detection of network-service worms; virtualization mechanism (Sec. 6.5) permits services that is, worms that exploit vulnerabilities in existing to be written using standard programming models, network services (e.g. HTTP, MS/RPC, MS/SQL, and respond to multiple IP addresses transparently. etc.) and that propagate by finding new vulnerable This reduces the difficulty of creating new feigning machinesinthenetworkandinfectingthemautomat- services and of integrating existing ones. The data ically. Henceforth,weshallrefertotheseas“worms”. repository (Sec. 6.2) provides storage for all the IP- Billy Goat is focused on detecting machines in the and application-level information generated. network infected with known worms, and in this re- ThefeignedservicesofferedbyBillyGoat(Sec.6.4) spect it is free of false positives by construction. It include those commonly exploited by worms. Each alsoprovidesadditionalinformationthatcanbeana- endeavors to offer sufficient functionality to accu- lyzed to detect new worms or other emerging threats rately determine the nature of an attack. All the in the network. feigning servers except for SMB (Windows file shar- Billy Goat is designed to take advantage of the ing) are implemented using a specialized framework propagation strategies of worms. To discover ma- written in Java (Sec. 6.3) that makes it easy to cre- chines to infect, most worms try to connect to IP ate new services, and which is carefully audited for addresses selected at random or scan entire ranges of security. 2 uNpeBdwia lsltyieg Gnsaeoturavrteesr Auto−update Billy Goat sensor intWerefabce AdmLinoicsatrlator New functionality Raw data Alarms System updates intWerefabce Central Billy Goat Billy Goat sensor intWerefabce Local Centralized Administrator Alarms data Raw data Alarms Reports Billy Goat sensor Web interface Local Administrator Existing processes and tools Raw data Alarms (monitoring consoles, reporting mechanisms, etc.) Policy−based subscription Local Administrator Figure2: DistributedBillyGoatarchitecture.. net Service Providers observe and keep statistics on Virtualization Analysis connection attempts to unused IP addresses within HTTP Modular their networks. We do not know of any that actually alaarnmding SMB/Lure respond to this traffic. reporting MS/SQL A similar approach is that of “network tele- MS/RPC scopes” [7], which also record traffic sent to large blocks of unused IP addresses, but without respond- Backdoors ing to the traffic, and just recording statistics about IP profiling Data those connection attempts. repository (open ended) Billy Goat fits the definition of a low-interaction honeypot [19] in that it pretends to be something that it is not, but without emulating the system in Figure1: BillyGoatinternalarchitecture. depth. Projects like HoneyNet [18] have used the concept ofdeployingfakevirtualnetworks. Atthetimewhen To satisfy the requirement of continued function Billy Goat began, their focus seemed to be more on in times of heavy worm activity, when the perfor- accuratelysimulatingthosevirtualnetworks,totrick mance of the network may be dramatically dimin- humanattackers. Honeyd[13]isatoolkitforbuilding ished, WDSs require distributed architectures. Each these virtual networks. While these tools are useful Billy Goat offers the ability to analyze and report for engaging human attackers, such sophistication is eventsdetectedlocally,thusprovidinggracefuldegra- notneededtotrickworms,whicharecommonlymuch dationofthedetectionservice. Atthesametimethe dumber. dataofallBillyGoatsonanintranetiscentralizedto On the other hand, Billy Goat has focused from assembleamorecompleteview(Figure2). Oneeffect the start on detecting automated attacks performed of this type of monitoring is the feasibility of detect- byworms, andforthisreasonBilly Goatdoes notgo ing infected machines on network segments that do to great lengths to hide what it is. This allows cer- not have a Billy Goat sensor installed. tainfoundationalconstructionsresultinginasimpler, We refer to a machine with Billy Goat installed more scalable architecture, and in the ability to have on it as a “sensor” or “Billy Goat” and to the pro- a “plug-and-play” implementation, which comes pre- gramsthatimplementtheindividualfeignedservices configured to observe a network in a useful fashion, as “feigning servers”. immediately after installation. Over time, in an example of convergent evolution, 4 Related work honeynets and honeypots have also been used to de- tect worms [6, 10, 14] and Billy Goat has grown in The idea of observing traffic to unused IP addresses deceptive sophistication. has been used for detection of network noise and The Internet Motion Sensor [2] consists of a dis- scanning behavior. For example, most large Inter- tributed network of dark address spaces, with the 3 objective of detecting and measuring threats. The 5.1 IP Layer IMS uses generic listeners to capture and respond to Data in the IP layer covers TCP, UDP, and ICMP connections, which results in less specialized sensors, headers. Itdescribeswhichsourceaddresseshaveini- with the advantage of making it easier to deploy lis- tiated contact with which destination address, along teners for new services. with details that vary with each particular protocol. Pang et al. [12] have implemented a system for We gather this data directly from the Linux kernel measuring and characterizing network noise, using IPtables facility [4]. unused IP address space and protocol-specific listen- ers. This system uses a non-distributed architecture, 5.2 Application layer anditsobjectiveisdifferentfromBillyGoat(charac- terizing the traffic seen, instead of precise identifica- Feigning servers gather data in the application layer. tion of infected machines). We split this data into three aspects: the first de- TheWorminatorproject[23]usesadistributedar- scribes basic source and time information. The sec- chitecture to correlate alarm streams from multiple ond two aspects address the application-level infor- sites to detect widespread suspicious behavior. It mation. One contains data that, as best we are does not employ unused IP address ranges like Billy able to determine, is inherent to the worm (constant Goat and other systems. The HoneyStat project [3] through all instances of it) while the other aspect useshigh-interactionhoneypots(fulltargetoperating describes the data that varies between different in- systems, running on an emulator) to collect informa- stances of the worm (in many cases the latter aspect tion, resulting in more detailed data, at the cost of is empty). far more extensive resource requirements. Instead of Thedistinctionbetweenconstantandvariabledata using an IP-address virtualization scheme, it utilizes is not uniformly well-defined across all worms. For the multi-homing capabilities of the target operating example, some account-guessing worms have a con- systems, resulting in a lower maximum number of stant list of account names they use, while others emulated IP addresses. modifythelistbasedonaccountsontheinfectedma- Another interesting project is the one reported chines. Feigning servers often have a certain amount by Yegneswaran et al. [25]. This project observes of intrusion detection logic to make this distinction. traffic to unused IP addresses using a combination Thus, a feigning server subject to account guessing of passive monitors, active responders, and virtual may have a list of standard account names attacked. networks and hosts. The report describes the use of If an attacked account name is in the list it is con- samplingtechniquestodealwiththelargeamountof sidered “constant,” if not it is considered “variable.” traffic while still being able to draw meaningful con- This allows us to focus analysis on “constant” while clusions. Unlike Billy Goat it uses a non-distributed still offering the possibility to analyze “variable” for architecture, with the consequent functional and im- emerging behavior (such as new worm variants). plementation differences. 5.3 Analysis model 5 Data Models The basic data model used for analysis is an aggre- gation, for each source address, of the above data At the heart of Billy Goat are several data models models over a specifiable time frame. It answers the used to represent the data gathered in both the IP question “what have we seen from this address re- and applications layers. In this section we describe cently?” Each per-source model is comprised of the the types of information that are collected, and the source IP address; the time period covered by the data structure constructed for analysis purposes. datainthemodel; theIPaddressesoftheBillyGoat The data models we use address the requirements sensorsthathavereportedtheactivity;andanaggre- of a WDS. In particular, we need to address the effi- gationoftheIP-andapplication-leveldatacollected, cienthandlingofhighvolumesofrepetitivedata; the including full worm capture when possible. distributednatureofthesensorsandtheneedforcen- The details of how analysis is performed are cov- tralizing the information they produce; the need to ered in Section 6.8. analyzethedatacollectedbythesensorsforevidence of worm infections; the ability to add new feigned services; and the ability to capture varying levels of 6 Implementation complexity depending on the data collected. All data is stored in a relational database, whose This section describes some of the components we structure is described in Section 6.2. have designed and implemented for Billy Goat. We 4 IPTABLES ACTIVITY REQUEST mapping to the three covered protocols (TCP, UDP, TIME TIME REQUEST_INDEX and ICMP). FLAGS are void for UDP and ICMP, TIME_OFFSET TIME_OFFSET SENSOR and the type of ICMP message is stored in both the REPORTER REPORTER REQUEST SPTandDPTfields. Storingthethreetypesoftraffic PROTOCOL SRC SEQID SRC SPT in a single database table makes it easier to extract DST REQUEST_INDEX HOST all the information with a single SQL query. SPT SENSOR SENSOR Thefulldescriptionsoftheapplicationlayeractiv- DPT HOST_INDEX HOST_INDEX ity,REQUESTandHOST,areexpressedinXMLand FLAGS SEQID HOST SEQID SEQID correspond to the previously described constant and variabledataaspects. Thisallowsustomeaningfully Figure3: DatabasetablesusedinBillyGoat. encodeinformationgatheredintheapplicationlayer. Forexample,asimpleUDPlistenerprovidesagreatly different data model than an extremely complicated have endeavored to use standard tools, formats, and protocol like SMB. XML allows us to keep simple APIs, and to provide simple, well-documented inter- descriptions simple while still allowing for complex faces by which Billy Goat may be integrated with descriptions. existingtoolsandprocess. Wehaveusedmanyopen- Rather than the standard practice of using na- source components throughout Billy Goat— in fact tive database external keys for references, we have its construction would not be possible without them. introduced the use of cryptographic checksums: REQUEST INDEX = md5(REQUEST) 6.1 Implementation platform HOST INDEX = md5(HOST) This offers the significant advantage that references BillyGoatisimplementedasaspecializedLinuxdis- dependonlyonthecontentofthedatabaserecordto tribution,whichself-installsonastandardPCrequir- which they refer. Our experience deploying a large ing only basic configuration information. It was our distributed system has shown this technique greatly intentiontomakeBillyGoatasappliance-likeaspos- eases data centralization. sible,sothatitcanbedeployedwithminimumeffort REPORTER is the IP address of the Billy Goat throughout a large network. Billy Goat includes ex- sensor that observed the event. SEQID is an auto- tensiveself-monitoringandrecoverymechanismsthat maticallyincrementedvaluethatweusetokeeptrack monitorsystemactivityandcorrectorreinitializeer- ofwhicheventshavebeenprocessedbydifferentcom- rantcomponents, includingthesystemitself(e.g. re- ponents. SENSOR is a short string identifying the boot). feigning server that produced the record. Tosupportthedistributedarchitecture,BillyGoat includes an automatic update mechanism. This en- 6.3 Event framework sures that each sensor is always current with respect to both signatures and software versions, and makes An extensible light-weight Java event framework is iteasiertomanagealargedistributedinfrastructure. used ubiquitously in the construction of Billy Goat: data acquisition, centralization, analysis, alarming 6.2 Database Tables and reporting. The basic components of the framework provide We split the data collected by Billy Goat (Sec. 5) support for buffering, fanning out, predicate expres- intofourdatabasetablestoaccommodatetherepeti- sionandevaluation,serializationandde-serialization, tive,andoftenverbose,natureoftheattacksusedby variousnetworktransports,connectionstodatabases, worms. and several other generically-useful components for These tables have the form shown in Figure 3, event handling. where solid lines indicate external keys (references Events are objects which have the structure of across tables) and dashed lines indicate temporal trees, whose nodes are named and have attributes, proximity (the times are generated in different lay- and whose leaves are arbitrary Java objects. This ers and hence may have slightly different values). structure is motivatedbyXMLand, indeed, theseri- TIME (an SQL timestamp) together with alized form of the objects is valid XML. TIME OFFSET (an unsigned integer that indicates Using the event framework for transport makes it the nth event in a given TIME) create a unique easy to inter-operate with existing tools and pro- timestamp for each event. cesses. As examples, we have implemented compo- PROTOCOL,SRC,DST,SPT(sourceport),DPT nents for sending events through email, syslog, the (destination port) and FLAGS apply to the IP layer, Tivoli Risk Manager console [22] and the Arcsight 5 console [1]. Switching to a different event transport mechanism is as easy as changing its definition in a External Network NAT Internal network configuration file. 6.4 Feigning servers Reverse Single NAT host As mentioned, the key observation mechanism of BillyGoatisacollectionoffeigningservers,eachcov- eringaninfectionvectorusedbywormstopropagate. Figure4: TraditionalandreverseNAT. Real servers may have vulnerabilities in different layers,andoftenthisrequiresustowritethefeigning Other feigning servers that currently exist in Billy servers in a way that can detect attacks in different Goat include MS/SQL, SMTP, DNS, LDAP over layers. For example, we may need to write the feign- SSL, eDonkey, Kerberos, and multiple worm back- ing server to be able to detect both low-level buffer doors, including those of MyDoom, Sasser, Dabber overflow vulnerabilities and application layer vulner- and some Beagle variants. abilities. In general, we try to build the servers to follow the corresponding protocol up to a point that accurate identification of the activity becomes possi- 6.5 Address virtualization ble. For example: Address virtualization transparently maps the large • TheHTTPfeigningserveracceptsandrecordsa ranges of IP adddresses covered by a Billy Goat to single HTTP request, and always responds with the single “real” address used by the machine. This an error, before closing the connection. virtualizationallowstheBillyGoatfeigningserversto bewrittenwithnospecialconsiderationforthelarge • The MS/RPC feigning server accepts and number of IP addresses that a Billy Goat machine records the first 3000 bytes (configurable) trans- monitors. Address virtualization is handled by the mitted by the client, before closing the connec- operating system using the IPtables mechanism [4]. tion. This initial payload generally contains ei- OneofthemechanismsbuiltintoIPtablesisNetwork ther the full code of the worm or an exploit par- Address Translation (NAT). Ordinarily, NAT is used ticular to the worm. to allow several machines inside a network to share a single external address. For Billy Goat, we do the • The SMB/Lure server is a special configuration reverse: allow a single machine to respond to a large of Samba [17] that appears to be a badly con- number of external addresses (Figure 4). figured machine (open shares, weak passwords, etc.). Because it is a full implementation of the protocol, SMB/Lure can often capture the full 6.6 Centralized configuration code of the worm, as it uploads itself. To maintain local configuration information while The majority of servers are written in Java and maintaining homogeneity across machines, we use a produce descriptions of individual interactions. The centralper-hostconfigurationfileinXMLfromwhich specific syntax of the records produced is left to each systemconfigurationfilesarederived. Thefileiscre- individual server. ated by a wizard during first-time installation, and Some protocols are sufficiently complex so that it canbelatermodifiedbyhandorbyspecializedtools. isnotfeasible,withinthescopeofourproject,toim- Having all the information in a single file makes it plement feigning versions of them. The most notable easy to “back up” an entire sensor. example is Microsoft’s SMB protocol. In this case, Arelatedissueismaintainingtheconfigurationfor we use an open source implementation of the SMB all the distributed sensors in a central place. This protocol [17] specially configured to mimic the vul- enables restoration of a sensor that is completely de- nerabilities of the native Windows implementations. stroyed (for example, by a catastrophic disk failure), We base our solution on the SMB/Lure system [9], by restoring its configuration to a new machine. It with some ideas from WormCharmer [11], but reim- also becomes possible to centrally control the config- plemented it to address the needs of large-scale vir- uration of machines, similarly to network configura- tualization and to fit with the rest of the Billy Goat tionschemessuchasBOOTPandDHCP.Basedona architecture. This is an example of how existing ser- unique identifier, the central server can provide each vices may be integrated into Billy Goat thanks to its sensor with its configuration information. virtualization infrastructure. This doubly-centralized configuration keeps all the 6 sensors as homogeneous as possible, while allowing Whenprecisewormidentificationispossible(when for per-sensor configuration in an automated and we can give a name to the worm), the findings are manageable fashion. labeled as “alarm” and the worm name is given. Clearlysuspiciousbutunidentifiablefindings(forex- ample,alargehorizontalorverticalscanoranexploit 6.7 Data centralization mechanisms set that does not fully identify a worm) are labeled Fordatacentralization,wedevelopedagenericmech- as “warning” and a description is given. All other anism for one-way transfer of relational tables. This data is labeled “unknown” and is available via direct mechanismisbuiltatoptheBEEPLite[5]implemen- query of the database. tation of BEEP [16], a modular, extensible protocol Billy Goat is false-positive–free by construction, supportingflexibleestablishmentandmultiplexingof because“alarms”areonlyproducedwhenawormcan communication channels. It is designed to tolerate be unambiguously identified. “Warning” and “un- intermittent sensor, server or network failures. Data known” events are also produced, but should not be isautomaticallytransmittedorretransmittedasnec- considered as unequivocal evidence of infection. essary, without loss or duplication. The mechanism makes use of the checksum-based external keys and 6.8.1 Alarm redistribution of the monotonically-increasing Sequence-ID field, as We have found that, in large organizations, utiliza- described in Section 6.2. tion of existing social structures is essential toward creating mechanisms of sufficient dynamism to ad- 6.8 Data analysis dress rapidly-emerging security problems. By con- trast, static lists tend to become outdated quickly. The data analysis is an iterative process that at- We have built a subscriptionservice forBillyGoat tempts to determine the types of activity that have thatutilizesthesestructures. Itproducesalertsbased been seen by Billy Goat from different source ad- onthecentralizedBillyGoatdatatoprovidethemost dresses in the network. This process is most com- complete coverage. It allows individual systems ad- plete when done in the central server to which all ministratorstoself-registertoreceivealertspertinent Billy Goat machines send their data, because it aids to their own network ranges, thereby ensuring that indiscoveryofglobalbehaviorthatmaynotbevisible alerts are delivered to someone who can actually do at the individual sensors. something to fix the detected problems. Open reg- Identification of known worms, attacks, and be- istration allows for a “living” mapping between net- haviors, is done using a combination of the following work and owner. Access control is done via social methods: mechanisms, by notifying the subscriber’s manager on registration. • Capture of the worm itself (for example, SMB wormsthatuploadthemselvestoBillyGoat). In this case, the MD5 checksum of its code is used 6.9 Modes of deployment to identify the worm with 100% accuracy. As wormsbecomemoresophisticated,weanticipate The fundamental premise of Billy Goat is respond- the utilization of analysis techniques from the ing to traffic directed to unused IP addresses, as de- anti-virus world [21]. scribedinSection2. Differentdeploymentmodescan be used and combined to direct such traffic to Billy • Observation of the exploits used by the worm Goat. (for example, an HTTP request containing a buffer overflow). Because Billy Goat is a first- 6.9.1 Static routes personobserver,itcanaccuratelycollectthefull set of exploits used by a worm, and match them This is the standard Billy Goat deployment mode. with known worms (for example, we know the A specific set of unused IP address ranges is desig- full set of exploits used by the Nimda worm). natedforBillyGoat, andtheappropriateroutersare reconfigured to send traffic destined to those ranges • Observation of other behaviors indicative of to a Billy Goat sensor. The amount of traffic seen worm activity (for example, horizontal scanning by the sensor depends on the size of the network or account guessing). These are weaker indi- range assigned. Addresses within non-routable ad- cators of worm activity in the sense that they dress ranges [15] that are not used locally may also do not make it possible to precisely identify the be routed to the Billy Goat, thereby expanding its worm. view of the network. 7 Advantages: a known set of IP addresses is as- Local Infected Destination machine Local router router signed to Billy Goat, which helps in controlling the Initial packet amount of traffic it has to process. Only simple con- Intercept Internet and reroute ICMP figuration changes need to be made to the routers. "host unreachable" Disadvantages: large-enough groups of network Further traffic addresses must be available and assigned by the net- work administrator. If the assigned range is too small, the functionality of Billy Goat is limited be- Billy Goat sensor cause it cannot observe much of the network traffic. Figure5: ICMP-basedBillyGoat. 6.9.2 ARP spoofing In a LAN, the machine that has a particular IP ad- Disadvantages: potentially dangerous, particu- dress is found by using ARP (Address Resolution larlyinconjunctionwithdynamicrouting. Inalarge Protocol). Using this protocol, machines and routers network,itiscommonthatcertainnetworksegments in the local network that need to send traffic to an go offline for short periods of time. If Billy Goat address X broadcast the question “who has address automaticallystartsrespondingforthem,itmaydis- X?” and wait for a response. If no response is re- turbservicesorautomatedmonitoringsystemsinthe ceived in a certain period of time, the address is con- network. sidered nonexistent. ARPisvulnerabletoARPspoofing [24],bywhicha 6.9.4 ICMP-based Billy Goat malicious host can “hijack” IP addresses by spoofing ARP responses. This same technique can be used by Oneofourmostrecentarchitecturaldesignsisamode a Billy Goat device to automatically grab currently of deployment in which Billy Goat operates in con- unused IP addresses. junction with a router to provide automatic utiliza- Advantages: no previous assignment of IP ad- tionofalltheunusedaddressesoutsidethelocalnet- dresses is needed, so the deployment effort is very work. This is how it works (Figure 5): low (simply connect the Billy Goat machine to the 1. When an infected machine in the local network network). triestocontactaremotenon-existingaddress,an Disadvantages: ARPspoofingispotentiallyvery ICMP “network unreachable” or “host unreach- dangerous if Billy Goat attempts to spoof the IP ad- able” message will be generally sent back. dress of an existing device. The implementation of this scheme needs to take into account the potential 2. The ICMP message is intercepted by the router appearance of devices (spoofing must stop immedi- local to the infected machine, which sets up a atelywhenanotherdevicewiththe same address ap- temporary route for that destination address, pears on the network). with the Billy Goat sensor as its next hop. 6.9.3 Billy Goat as default LAN route 3. When the infected machine, after not receiving aresponse, retransmitsitspacket, itwillbesent Instead of having specific network ranges assigned to to the Billy Goat sensor, which will respond to the Billy Goat sensor, the router can be configured it. with a route to Billy Goat for the entire LAN and higher-priority (mask length) routes for the ranges ManymodernwormsimplementtheirownTCP/IP that are being used. This gives Billy Goat all traffic stack, often without the automatic retransmit fea- for LAN segments that are not in use. tures. By keeping additional state in the router, the This scheme can be implemented statically (when retransmit can be spoofed so that those worms are therouterhasastaticroutingtable,andtherouteto also caught. the Billy Goat sensor is added as the default route) Advantages: this mode of operation automati- ordynamicallyinconjunctionwitharoutingprotocol cally spoofs every unused address outside the LAN. such as BGP. This provides Billy Goat with a trulyexpansive view Advantages: large network coverage, and ease of and allows it to quickly identify local infected ma- configuration (the Billy Goat sensor can be config- chines. ured to “spoof everything,” and it will respond to Disadvantages: router support is needed to im- any traffic it receives). plement this scheme. 8 A NIDS at position 2 would see an increase in the LAN number of Nimda-related alarms of attacks originat- NIDS 2 NIDS 1 ing from machines on the LAN (those against the Actual Intranet and those against Billy Goat). A NIDS at Intranet position 3 would see attacks from both the LAN and Router theIntranetandwouldhavegreatlyincreasedfidelity stemmingfromthefactthatitdoesnotseeanylegit- Billy Goat imate traffic. We are actively exploring the benefits NIDS 3 oftheseeffectsbothonfidelityandoncostreduction owing to the expanded view that a Billy Goat offers Figure6: PossibleNIDSplacementswithrespecttoBillyGoat. a NIDS. 7 Environmental effects 7.3 Failure Modes We have observed an interesting failure mode in the Oneoftheareasthatwehavefoundmostinteresting default route and Router/ICMP modes of deploy- istheeffectofBillyGoatonothersystemsinthenet- ment. The problem occurs when a machine or net- workingenvironment. Wedescribesomeoftheeffects work goes down and Billy Goat automatically starts we have observed during our deployment experiences responding for it. In this case, liveness checking over several years. mechanisms, such as ICMP echo (ping), yield decep- tiveresults. Thisisespeciallyproblematicwhenthese 7.1 Network discovery checkingmechanismsareconnectedtoothersystems. In a testing phase, we accidentally induced the crash Thefirstaspectthatweencounteredwastheinterac- of a problem-ticketing system which got caught in tionofBillyGoatwithdevicesandsoftwarethatscan the cross-fire between two automated systems: one the network legitimately. Normally, Billy Goat re- asserting that the network was down, based on the spondstothesescansforeachoneoftheIPaddresses truth, and the other asserting that it was up, based it is spoofing, producing wildly inaccurate results for on the fact that the network seemed to be reachable. the scanner. We addressed this problem by adding We take this failure mode, induced by a relatively a mechanism that makes Billy Goat respond “truth- passive system, as an interesting finding to keep in fully”toaconfigurablesetoffixedIPaddresses. This mind when developing automatic intrusion response makes Billy Goat appear like a regular machine to systems. authorized scanning devices. 7.2 Network sensors 8 Findings The second area that we discovered was a sharp in- Our deployment of Billy Goat sensors, both in large creaseinthenumberofalarmscomingfromnetwork- corporatenetworksandontheInternet,haveafforded based IDSs (NIDS) that observe the stream of traffic us a view into a mass of data about worm-infected flowing to Billy Goat. This stems from the fact that machines that is rarely available elsewhere. In this Billy Goat, in allowing illegitimate connections to section we share some of our observations. complete, increases the number of real, albeit harm- less, attacks seen on the network. The size of the in- crease depends on the relative sizes of the networks, 8.1 Worm flare-ups the saturation level of the network connections, the rootcauseofthealarmandthesignaturesandplace- Confirming previous findings [20], we routinely see ment of the NIDS. flares of old worms, such as Code Red, Nimda and ThiseffectwasfirstnoticedwhentheNimdaworm Sapphire [8]. These reappearances are often brief, was active in a well-connected network whose Billy due to monitoring and blocking mechanisms that are Goat address space was approximately 100 times now aware of them. larger than the number of actual hosts, and with We observe that the frequency of such recurrences NIDS at position 1 (see Figure 6). The result was seems affected by the propagation vectors that each roughlyacorresponding100-foldincreaseinthenum- worm uses. For example, Sapphire uses MS-SQL, ber of Nimda-related alarms of attacks from the In- which is often turned on automatically by other pro- tranet against the LAN. grams (e.g. MS Visio 2000 and MS Project), so it 9 it is routed to real machines), in a manner similar to the numerous independent instances of private ad- External dress ranges [15] (e.g. 10.0.0.0/8). Beyond the ex- fake network pected worm traffic observed on the Internet, this Internet Billy Goat Internal network setup has offered us a view of some interesting be- NAT gateway haviors, including the following: • We often see “ghosts” of internal network struc- tures,suchasmachinesontheInternetattempt- Figure7: Double-routingschemeforexternaldeployment. ing to connect to internal email servers. This is explained by application caching of DNS in- is likely for previously deployed machines to become formation on machines that change networks ei- vulnerable. ther physically (laptops brought home and con- nectedtotheInternet)orvirtually(disconnected VPNs). 8.2 Noise • We have also observed unexplained traffic to in- Billy Goat has been deployed in IBM and customer ternaladdresses. Forexample,responsestopeer- networks. In addition to identifying worm traffic, we to-peertraffic,directedtoaddressesthatarenot have observed behaviors that vary from network to assigned (even internally). One possible expla- network. In some networks we have been able to nation for this behavior is that someone is using track the causes of these behaviors to misconfigured an internal address range on a separate private devices,poorlywrittensoftware,orotherexplainable network, behind a NAT gateway. If some ma- activity. Inthesecases,thedetectionofnoiseisause- chine in this private network is identifying itself fulsideeffectofaBillyGoatdeployment,whichcon- withitsinternaladdress,itspeersontheInternet tributes to general network management and health wouldbeattemptingtorespondtothataddress, monitoring. whichwaspreviouslynotrouted, butwhichnow Some notable examples of explainable phenomena goes to the external Billy Goat sensor. This are: raises questions about the accuracy of identify- • Machines with an incorrect DNS server configu- ing and tracing peer-to-peer users through par- ration, which happens to be in a spoofed range. ticipation in the peer-to-peer protocols. We often observe repeated DNS requests to a specific address, from a single address. 9 Conclusions and future work • Broken software implementations that result in network scanning. For example, in one occasion We have built and widely deployed Billy Goat, we detected a Linux machine that was scanning a worm-detection system specialized in detecting the network on SMB ports. Upon examination, network-serviceworms. BillyGoathasbeendesigned itwasdiscoveredthatanSMBclientwasrunning to be scalable, to operate gracefully in a large dis- on it, and it was scanning the network trying to tributed environment, and to provide extremely ac- find suitable servers, instead of using broadcast curate detection of worm-infected machines. packets. BeyondthebasedescriptionofBillyGoat,thecon- tributions of this paper include: 8.3 Internet deployments • deployment modes for honeypots, in particular We have installed a number of Billy Goat sensors on the new ICMP-based mode; the Internet for purposes of stress-testing, debugging • environmentaleffectsofBillyGoatdeployements and data capture. Some of these sensors have been including the effect upon NIDS; configured to mirror address spaces used in internal • potential failures of other systems induced by corporate networks that are connected to the Inter- Billy Goat; net via NAT (hence, traffic to or from these address spaces should normally not be seen on the Internet). • the double routing of an IP address space on Such a mirrored deployment is illustrated in Fig- theInternetandanintranetconnectedviaNAT, ure 7. It results in an address range that exists both leading to the observation of two varieties of on the Internet (where it is routed to the external “ghosts”: those manifested from internal ma- BillyGoatsensor)andontheinternalnetwork(where chines and those unexplained. 10

Description:
Defense and Detection Strategies against Internet Worms. Artech House Publish-ers, 2003. [11] M. Overton. Worm charming: taking SMB Lure to the next level.
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.