IoTScanner: Detecting and Classifying Privacy Threats in IoT Neighborhoods


Sandra Siby, Rajib Ranjan Maiti, Nils Ole Tippenhauer
sandra_ds@sutd.edu.sg, rajib_maiti@sutd.edu.sg, nils_tippenhauer@sutd.edu.sg
Singapore University of Technology and Design (SUTD)
8 Somapah Road, 487372 Singapore

arXiv:1701.05007v1 [cs.CR] 18 Jan 2017

ABSTRACT

In the context of the emerging Internet of Things (IoT), a proliferation of wireless connectivity can be expected. That ubiquitous wireless communication will be hard to centrally manage and control, and can be expected to be opaque to end users. As a result, owners and users of physical space risk losing control over their digital environments.

In this work, we propose the idea of an IoTScanner. The IoTScanner integrates a range of radios to allow local reconnaissance of existing wireless infrastructure and participating nodes. It enumerates such devices, identifies connection patterns, and provides valuable insights for technical support and home users alike. Using our IoTScanner, we attempt to classify actively streaming IP cameras from other non-camera devices using simple heuristics. We show that our classification approach achieves a high accuracy in an IoT setting consisting of a large number of IoT devices. While related work usually focuses on detecting either the infrastructure, or eavesdropping on traffic from a specific node, we focus on providing a general overview of operations in all observed networks. We do not assume prior knowledge of used SSIDs, preshared passwords, or similar.

1. INTRODUCTION

In the context of the emerging Internet of Things (IoT), a proliferation of wireless connectivity can be expected, with predictions reaching as many as 500 smart devices per household in 2022 [17]. Heterogeneous smart devices that require transparent Internet connectivity need to be integrated into a common infrastructure. In particular, communication standards such as Zigbee, Bluetooth, Bluetooth Low Energy, and WiFi are expected to provide such infrastructure either as mesh networks, or traditional single-hop access points. Often, vendors will sell their own application gateway with integrated access point, which to a certain degree hides the underlying communication from the owner. Although wireless communications have many benefits in terms of usability, flexibility, and accessibility, there are also security concerns [34]. Among those concerns are privacy and, in general, controllability. With this growth in the size and complexity of wireless networks, appropriate tools are required to better understand the environment's wireless traffic.

In this work, we propose the idea of an IoTScanner. The IoTScanner is a system that allows for passive, real-time monitoring of an existing wireless infrastructure. The IoTScanner classifies and identifies devices that are communicating using the infrastructure and traffic patterns among the participating devices. The IoTScanner will supply this information in a structured manner so as to provide valuable insights for technical support and home users alike.

While related work usually focuses on detecting either the infrastructure, or eavesdropping on traffic from a specific node, we focus on providing a general overview of operations in all observed networks. We do not assume prior knowledge of used SSIDs, preshared passwords, or similar. In addition to this, the IoTScanner operates passively, without active probing or other transmission. Our emphasis is also on providing real-time analysis and visualization of the scanned network, unlike some related work that focuses on offline analysis.

We note that we do not consider physical layer effects such as collisions and packet loss, goodput, or interference between networks in this work. We leverage recent developments in the area of cheap software-defined radio modules to handle the physical layer of the wireless traffic to provide the Link layer traffic. As most wireless communication standards nowadays by default encrypt the Link layer payload, we consider only Link layer traffic for analysis in this work (without considering higher layer traffic). In particular, we aim to address the issue of providing controllability and privacy to the user of an IoT environment. The IoTScanner can provide details on the devices, the links between them and the amount of traffic in any environment in which it is placed. This information can then be used to classify and identify devices that could impact the privacy of the user. Three communication standards (WiFi, Bluetooth Low Energy (BLE) and Zigbee) are considered in this work.

We summarize our contributions as follows:

• We identify the issue of opaqueness in IoTs: integration of heterogeneous IoT devices will lead to a plethora of wireless standards and networks operated in parallel. Without a system like the IoTScanner, it will be difficult to keep an overview of communication in a space (in particular for third-party networks).

• We propose an abstract system design that allows seamless integration of a range of radio devices with a scanning server. The server's data can then be accessed by users through a RESTful API. The data obtained via the REST API can then be used for visualization of the scanned environment and further analysis.

• We present an implementation of the proposed design, and demonstrate its effectiveness through a set of experiments.

• We discuss how our system can be potentially used to identify privacy threats in an IoT environment.

The structure of this work is as follows: In Section 2, we briefly summarize the relevant background for our work. We propose our system design for the IoTScanner in Section 3. Our implementation is described in Section 4. In Section 5, we run some experiments to evaluate our implementation in an environment that consists of WiFi devices. In addition, we discuss the feasibility of the IoTScanner to perform traffic classification and device identification in this section. Sections 6 and 7 describe experiments in BLE and Zigbee environments. We discuss some of the challenges we faced in Section 8. Related work and a comparison of features with existing tools are summarized in Section 9. We conclude the paper and discuss possible future work in Section 10.

2. BACKGROUND

2.1 Internet of Things

The Internet of Things (IoT) refers to a network of physical devices that have the capability of gathering and transferring data from the environment, and interacting with remote computers over the Internet [4,18]. IoT devices can be anything from cellphones to smart lamps, as long as they have the ability to transfer data over the Internet, which essentially means they are assigned some public IP addresses.

2.2 Passive Monitoring

Passive monitoring is a technique for observing the traffic in a network only by listening to signals that are already available in the network, without injecting any extra signal. Unlike active monitoring, passive monitors do not probe the devices they are observing [22]. In the context of wireless networks, the technique is more suitable as it only requires a traffic interceptor to be physically present in the environment, without requiring any wiretapping. Devices called sniffers or monitors are placed in the network to intercept frames transmitted by devices in their vicinity. Radios that can be used to sniff different standards (WiFi, Bluetooth, Zigbee) are available on the market and can be used for passive monitoring.

2.3 System Model

In our system model, we assume that there are one or more wireless networks in an IoT scenario in which the scanner is placed. For example, in a smart home, electronic gadgets like smart lighting can make use of Zigbee communication, personalized health monitoring can use Bluetooth, and Internet access can happen via WiFi communication. The scenario can have one or more IoT devices, but not all the devices need to be active (i.e., sending or receiving some information) all the time. For example, a smart blood pressure monitor may get activated only about 2-3 times a day, whereas a smart light needs to be active throughout the day.

In the basic operation mode, the scanner needs no prior knowledge of the infrastructure and network configuration in the IoT environment, i.e., no information is required about which device is used for what purpose, or what kind of encryption is used by them, if any. We assume that the user of our IoTScanner system does not have any control over the devices, or know if they are present or where exactly they are located. Hence, there is a chance that IoT devices that are present in the environment (but not actively participating in the network) are not detected by the scanner.

2.4 Attacker Model

We assume a limited attacker who is honest but curious. For example, the attacker might access a webcam set up in a room to spy on persons in the room, but the attacker will not set up a dedicated device for this (in particular, the device will not use some non-standard wireless communication). By obtaining the images from the camera over a certain time period, the attacker violates the privacy of the user. How the attacker obtained access to the camera does not matter for our analysis.

As we do not assume that we have control over the network(s) in the environment, we will not be able to detect the camera activity on the Network Layer or at a gateway or similar.

3. A PASSIVE ANALYSIS FRAMEWORK

In the following we provide details on our proposed passive analysis framework. We start with a concise problem statement, summarize the intuition behind our design and then provide additional details on individual components.

3.1 Problem Statement

In this work, we address the following problem: "In an IoT environment, which devices are present, and communicating via wireless communications?" That information can then be used to address questions such as: "Are the IoT devices in my environment used by an attacker to violate my privacy?" Ideally, such a goal should be reached passively, without actively interfering with the environment (which could be a public place such as an airport, hotel, or similar).

This problem cannot be solved by normal end-users easily; it requires specific software, hardware, and technical understanding of wireless protocols. We refer to this challenge as wireless opaqueness for the end users: they are agnostic to the way networking works, which links exist, and how data flows in the neighborhood.

3.2 Intuition

To address the problem stated above, we now propose the IoTScanner. The aim of the IoTScanner is to provide real-time, passive monitoring of an existing wireless infrastructure that potentially constitutes an IoT environment. The scanner will identify active devices that are communicating using that infrastructure, and attempt to categorize IoT traffic depending on features such as the volume of traffic observed.

In particular, we use the following constraints to achieve passive monitoring. The IoTScanner:

• will not associate with any access point present,

• will not perform active probing or fingerprinting, and

• will not decrypt the observed network traffic.

The IoTScanner only observes the network traffic at the Link layer, and then analyzes this traffic using frame header information. A more offensive active scanner would not need to follow those constraints.

The long term goal is to turn the IoTScanner into a convenient hand-held device. In the context of this paper, we are using a Raspberry Pi 3 [31] as platform, together with devices such as Android tablets for visualization via a web application.

3.3 IoTScanner: System Architecture

The main features provided by our IoTScanner are ubiquitous (wireless) signal interception, packet filtering, analysis of the captured packets, and storage of the results. Finally, results will be accessible via APIs to be visualized through a web-based visualization system.

Figure 1 shows the overall architecture of our proposed Passive Analysis Framework, termed the IoTScanner. The IoTScanner has four main functional modules: traffic interceptor (captures wireless signals), traffic analyzer (analyzes MAC frames), data storage (stores processed frames and results), and traffic visualizer (displays the status of the network).

Figure 1: Overview of the IoTScanner architecture, which consists of four modules: traffic interceptor, traffic analyzer, data storage, and traffic visualization.

Each of these modules is described in detail in the following. We provide a detailed comparison of the IoTScanner with other related projects in Section 9.

3.4 Traffic Interceptor

The traffic interceptor module provides flexible low level access to the wireless medium. In the context of IoT, widely used protocols are 802.11/WiFi, Bluetooth, Bluetooth LE, Zigbee, and Z-Wave, each being prevalent in particular application area(s). For example, devices accessing the Internet primarily use WiFi, wearable/on-body devices (e.g., blood pressure monitor, smartwatch) use Bluetooth or Bluetooth LE to connect to their parent devices, smart home appliances (e.g., smart meter) use Zigbee, and electronic appliances (e.g., AC, fridge, fan) use Z-Wave protocols [4]. However, there is no clear line of separation on which device can use which protocol; it primarily depends on the usage of the devices or their energy consumption behavior. Therefore, it is important to have interception capabilities leveraging multiple radios, either one radio for each protocol, or software defined radios, to cope with the variety of protocols available in the IoT spectrum.

If multiple channels of the same standard should be reliably monitored, it might even be necessary to use multiple radios of the same kind, each on its own channel.

3.5 Traffic Analyzer

The traffic analyzer scrutinizes each Link layer frame captured by the traffic interceptor by parsing only the frame overhead (i.e., header and trailer), in keeping with the passive analysis philosophy of the IoTScanner. It extracts relevant information from the frames, such as the source and destination addresses, frame type and sub-type, and SSIDs present, to be used for targeted analytics. We note that the frames need to be treated on a protocol by protocol basis, because parsing Bluetooth LE frames is significantly different from parsing WiFi frames or Zigbee frames. In addition, the analyzer records some additional useful information such as the channel number on which the interceptor is capturing traffic, the size of the frame captured (in bytes) and the timestamp at which the frame is captured. It sends the extracted frame information to the data storage module on a per frame basis, where the actual analytics is performed.

3.6 Data Storage

The data storage module, acting as a simple server, provides a stable storage unit primarily for storing processed information about individual frames. The storage can be done by various means; however, we prefer to employ a simple and lightweight database system. In addition to storing the information, the system provides a simple interface for querying the database to carry out analytics. For example, the data storage interface can be realized using a set of RESTful APIs that can easily be accessed over the Internet, and be used by third-party applications. The APIs are provided for the following functions: storage and retrieval of frame information, generalized categorical queries on the DB, and storage and retrieval of analysis results.

3.7 Traffic Visualizer

The traffic visualizer provides a visual representation of the observed IoT environment at a particular time window. A network, i.e., an IoT environment, can be viewed from different perspectives, starting from device-to-device connection to device or link specific information to semantics of the underlying, possibly collaborative, activity performed by the IoT establishment. In this paper, we consider a mix of all these perspectives at a high level for an initial comprehension of the IoT environment, where we display a graphical view of the underlying network along with some supporting description of the devices and the associated links, if any. We do not infer any application specific information, e.g., if the IoT is used for maintaining room temperature by controlling AC; rather, our interest in this paper is to create an inventory of the devices present and the amount of traffic generated by each of those devices. Because the devices need not be sending or receiving frames all the time, the traffic visualizer needs to automatically and periodically update to reflect any changes (in terms of both the devices and links) in the IoT environment under surveillance. Note that the visualizer displays only those nodes, and accounts for only those links, that are present in the database currently.

4. SYSTEM IMPLEMENTATION

In this section, we present our implementation of the four proposed IoTScanner modules (traffic interceptor, traffic analyzer, data storage, and traffic visualizer). An overview of our implementation is shown in Figure 2. We use different wireless interfaces for the traffic interceptor, Scapy (a Python library) for the traffic analyzer, SQLite for the data storage, and Javascript for the data visualization. The data storage uses RESTful APIs to communicate with the analyzer and the visualizer modules. The interceptor and analyzer components run on a Raspberry Pi 3 device, the data storage can be run on the same device or a server, and the visualization is displayed on an Android tablet.

Figure 2: Overview of IoTScanner implementation.

4.1 Traffic Interceptor

The traffic interceptor module can be implemented by a number of radio interfaces or by using software defined radios. We decide to use the following radio interfaces that are commercially available, and connectable via USB: the TP-Link TL-WN722N 802.11n wireless adapter (for WiFi), the Ubertooth One (for Bluetooth LE), and the Atmel RZUSBstick (for Zigbee).

The TP-Link TL-WN722N adapter [36] can be easily configured to operate in monitor mode and capture WiFi frames with configurable channel hopping (over 13 channels). The adapter can also support certain active attacks such as deauthentication attacks, a useful feature in case we extend the functionality of the IoTScanner to include traffic decryption. We perform sequential channel hopping to obtain an overview of the traffic on all channels. The interface dwells on a particular channel for a certain period of time before hopping to the next channel. The dwell time can be configured by the user; otherwise it takes a default constant value.

We note that since we dwell on a single channel at a time, frames on channels the interceptor is not listening on will not be captured. Hence, we capture a subset of the overall traffic.

For Bluetooth Low Energy (BLE) traffic capture, we use the Ubertooth One [29], an open source Bluetooth monitoring platform. Compared to other Bluetooth monitoring solutions, this platform is inexpensive and easily adaptable to our settings. We focus on Bluetooth Low Energy (BLE) traffic as its presence is increasing in IoT devices, especially those used in healthcare. The Ubertooth One is able to detect the channel hopping map from sniffed BLE connection request packets and follow connections by hopping in the same pattern. The Ubertooth also has a promiscuous mode to follow connections that were already established at the time of sniffing. As in the case of WiFi, the interceptor may miss some devices as a result of channel hopping, which may be compensated by increasing the observation period.

For Zigbee traffic capture, we use the RZUSBstick from Atmel [3], which supports low power wireless applications using Zigbee, 6LoWPAN, and IEEE 802.15.4 networks. This adapter can also perform channel hopping (over 16 channels, from channel 11 to channel 26 in the 2.4 GHz band). Other features of this traffic capturing procedure are similar to those for WiFi or Bluetooth networks.

4.2 Traffic Analyzer

The traffic analyzer module consists of three sub-modules: extractor, collector, and storage handler, as shown in Figure 3. The extractor and the collector modules are implemented using Scapy [6], a Python library for packet capture and analysis. Every frame captured by the traffic interceptor is an input to both the extractor and collector sub-modules. Since the aim of the IoTScanner is to provide a quick overview of the IoT environment, only relevant pieces of information are extracted from every captured frame. This is performed online (without buffering) so as to find quick answers to only those questions that we have assumed to be sufficient to provide an overview of the environment. The extractor module extracts the following from the respective frames:

• In WiFi frames (type, sub-type, length, MAC address and SSID)

• In Bluetooth LE frames (type, length, MAC address type (public or random), MAC address, node local name)

• In Zigbee frames (type, length, PAN ID, addresses)

Figure 3: Traffic Analyzer module of our IoTScanner, for WiFi frames.

The collector module collects information such as the system time during frame capture, the channel number on which the frame is captured, and the RSSI (for potential device localization). Both of these modules supply the captured information to the storage handler module.

The storage handler sub-module sends the collected and extracted information to the data storage module. It sends the frame information (in JSON format) to the data storage module via the HTTP POST method. The module also has the option of storing the frames in a PCAP file and periodically sending the files to the data storage for further analysis of the overall traffic (which might be more computationally intensive). It is worth mentioning that high level frame analysis is not possible at the traffic analyzer since this requires aggregated information from multiple frames, and hence such analysis is performed at the data storage server end.

4.3 Data Storage

The data storage module of the IoTScanner provides a database server, accessible via a set of APIs, to store and retrieve the extracted/collected frame information. Two modules, the traffic analyzer and the traffic visualizer, interact with the storage module using these APIs over the network, potentially the Internet. We have implemented this module as a web server which interacts with a light-weight database. The APIs in the server are developed using Flask (a Python web framework), which helps to easily build the RESTful APIs for our purpose. We use SQLite to build a light-weight relational database system for our data storage module.

4.4 Traffic Visualizer

We implement the traffic visualizer using Javascript with the D3 [7] library for network visualization. The visualizer is compatible with hand-held devices, such as smartphones and tablets, in addition to desktop browsers. The visualizer displays the IoT environment in a number of ways (e.g., summary text, connectivity graph, bipartite relation, etc.) to help the user understand different aspects of the underlying network. By default, it displays a network graph accompanied by a brief summary text.

Figure 4 shows a sample network graph obtained during one of our experiments. The colored circles represent the nodes in the network and the arrows between a pair of nodes indicate that the pair exchanged at least one frame. We identify the access points from beacon and probe request frames and internet gateways using simple heuristics. The access points and gateways have icons to identify them in the visualizer.

Figure 4: Example screenshot of our visualization app: IoT scenario represented using a graph structure.
Selecting a node (red circle) in the graph of Figure 4 displays the details about the underlying device.

Our visual display is interactive in the sense that, on selecting a node or an edge, more details about the selected component are displayed. For example, in Figure 4, the selected nodes and edge are highlighted in red and their details are displayed in the sidebar. The graph is updated at a fixed interval of p (configurable) seconds in order to reflect any changes in the number of devices or the communication links in the network.

Our overall implementation using the Raspberry Pi as interceptor (along with the tablet as visualizer) is shown in Figure 5.

Figure 5: IoTScanner with visualization on a hand-held device.

5. WIFI EXPERIMENTS

In all our experiments, we place the scanner in an IoT testbed [33] that contains a number of IoT devices. We conduct experiments with devices that use WiFi, Bluetooth Low Energy and Zigbee communication, with a larger focus on WiFi enabled devices since WiFi is the predominant mode of communication for IoT devices. We perform our experiments in three phases after grouping the devices based on networking technology. In this section, we discuss the experiments using WiFi enabled IoT devices. Experiments with BLE and Zigbee are described in Sections 6 and 7.

5.1 IoTScanner configuration

The IoTScanner, while intercepting WiFi traffic, uses two input parameters:

• Dwell Time ($T_d$): period of time (in seconds) that the traffic interceptor listens on a channel before moving to the next channel ($T_d \in \{5, 10, 20, 30\ \text{(default)}, 40, 50, 60\}$).

• Hops ($T_h$): number of channel hops performed by the traffic interceptor ($T_h \in \{1, 6, 13\ \text{(default)}, 26, 65, 130\}$).

These parameters account for the amount of time the IoTScanner is exposed to an IoT environment; for example, $T_h = 13$ and $T_d = 5$ s imply that the traffic interceptor scans for $13 \times 5 = 65$ s, and the analysis will be performed only on the traffic captured during this period.

5.2 Evaluation metrics

The following are the common metrics for our experiments:

• nodes: the total number of active devices in the observed environment (including access points). A device is considered to be active if it is observed to have sent/received at least one frame.

• links: the number of unique pairs of nodes that have exchanged at least one frame (excluding broadcast and multicast frames).

• SSIDs: the number of access points seen in the environment.

• Frames: the total number of frames (sent or received) per device; these are further classified by type into cFrames (control), mFrames (management) and dFrames (data).

• Bytes: the aggregated number of bytes (sent and received) per device; these are further classified by type into cBytes (control), mBytes (management) and dBytes (data).

5.3 Experiment Settings

In our controlled experiments, we use six IoT devices: one Nest Cam security camera, one Netatmo camera (with face recognition feature), one TP-Link security camera (of relatively lower resolution compared to the other two cameras), one Amazon Echo wireless speaker, one desktop with wireless adapter (to perform general web surfing), and one WiFi access point. We conduct 10 experiments each in a high-load setting (the default) and a low-load setting.

High-load. This is the default setting for our experiments, and in this setting, all three cameras (focusing on the same area) are actively streaming video via the Internet to a mobile device located outside the test environment. The Amazon Echo [2] loudspeaker is streaming audio songs continuously during the experiments. The desktop with wireless adapter is used to browse web pages intermittently.

Low-load. In this setting, all the devices are present but none of them are actively used. For example, the IP cameras are switched on, but the live video is not accessed. The Amazon Echo is kept on but is not playing any music.

Understanding the Network Structure. First, we verify if the IoTScanner can identify the nodes (with their MAC addresses) and the links among them from the captured traffic, hence determining the underlying network structure. We observe that our IoTScanner can correctly capture traffic from all six devices in each of our experiments. The scanner identifies the devices that sent out broadcast frames to advertise their presence in the network, and the wireless channels on which each of these devices sent/received traffic. We experiment with various values (as noted earlier) of the two input parameters, dwell time ($T_d$) and hops ($T_h$). We observe that lower values of these parameters result in the scanner being unable to capture all the devices during the observation window. After multiple rounds of experiments, we conclude that $T_d = 30$ and $T_h = 13$ are optimal values to quickly capture a sufficient amount of traffic for the traffic classification analytics we want to perform. Finally, we use beacon and probe request frames to identify access points in the network, and simple heuristics (on amount of data and destination MAC addresses) to identify the Internet gateway.

Results of these experiments are not presented here, as they are used more as a check of the correctness of our IoTScanner system. We present more interesting results that are obtained from the traffic analyzer of our system on device classification in an IoT environment.

5.4 Per-node Traffic Classification

We perform simple analytics on the captured traffic to classify IoT devices. The mapping between the device labels we use and their actual names is shown in Table 1.

Table 1: Device labels and the corresponding devices used in WiFi experiments.

Label   Device Name
Dev-1   Desktop
Dev-2   Netatmo camera
Dev-3   TP-Link camera
Dev-4   Access Point
Dev-5   Gateway
Dev-6   Amazon Echo
Dev-7   Nest Cam camera

Frames, mFrames, cFrames, and dFrames. First, we find the total amount of traffic associated with each device in the network, along with the type (management, control and data) in the high-load setting. We determine the traffic in terms of bytes (Figure 6A) and frames (Figure 6B). A frame is associated with a device if its MAC address is found in the frame either as the source or destination address.

[Figure 6, panels A (Bytes) and B (Frames): bars for total, mframes, cframes and dframes per device, Dev-1 to Dev-7.]
Figure 6: Variation of amount of traffic per device (subplot (A) in Bytes, and (B) in Frames) in high-load setting.

Interestingly, it can be seen that the traffic amount and its type can classify the devices at a high level. For example, the highest amount of traffic, in terms of both bytes and frames, is associated with Dev-4, which is the access point (see Table 1) that connects to all other devices present. Dev-5, which has no control and management frames, is the gateway device connected through Ethernet to the access point. The lowest amount of traffic is seen in Dev-1, the desktop, which is used for occasional browsing during the experiments. The rest of the devices (Dev-2, Dev-3, Dev-6 and Dev-7) are associated with high traffic as they are either IP cameras or the Amazon Echo performing continuous streaming. For each of the devices, it can be seen that the data traffic (in bytes) dominates the control or management traffic, and is almost equal to the total amount of traffic of the corresponding device. However, the difference between the number of control and data frames is not as high, especially in the case of the cameras. We observe that the acknowledgement frames contribute to the large number of control frames in this case.

We explore the traffic volume in low-load settings (results shown in Figure 7). It can be seen that almost all the devices have control traffic comparable to data traffic (in bytes), as opposed to high-load settings where data traffic dominates the control traffic. In fact, the standard deviation of traffic volume is quite significant in this setting for all the devices, perhaps because the devices send more status information at certain times. Thus, an analysis of the traffic amount and its composition can potentially be used to learn if an IoT setting generates a high volume of data traffic.

[Figure 7, panels A (Bytes) and B (Frames): bars for total, mframes, cframes and dframes per device, Dev-1 to Dev-7.]
Figure 7: Variation of amount of traffic per device (subplot (A) in Bytes, and (B) in Frames) in low-load setting.

Sent and Received Volume. Since the overall traffic mainly consists of data frames, we investigate the amount of data traffic (in terms of bytes and number of frames) sent and received by each device (Figure 8 shows results in high-load settings). The highest amount of traffic (either sent or received) is observed in Dev-4, which is the access point. Dev-5 (the gateway) has about three times higher received traffic (in bytes) as compared to sent. Note that the traffic towards the Internet is the received traffic for the gateway, and traffic coming into the local network counts as sent traffic for it. Thus, the result is consistent with the ground truth, as there are three cameras sending video traffic. We also notice that the cameras (Dev-2, Dev-3 and Dev-7) have a high amount of sent traffic, as expected. However, the amount of sent traffic varies significantly among the cameras. The received traffic is higher than the sent traffic for Dev-6, which is the Amazon Echo continuously streaming and playing audio songs from its server. Our experiments indicate that active IoT devices such as IP cameras or music streaming devices can be identified by analysis of sent and received traffic volumes in high-load settings.

Figure 8: Variation of the number of frames and its types for each participating device in high-load setting.

We also investigate traffic flow in the low-load settings (results shown in Figure 9). Surprisingly, it can be seen that cameras do not necessarily produce a higher amount of sent traffic compared to received traffic (e.g., Dev-2). The Amazon Echo sends and receives an almost equal amount of data in this setting. As expected, the gateway still receives more data than it sends, probably because these IoT devices continue to update their status to their associated cloud servers.

Figure 9: Variation of the number of frames and its types for each participating device in low-load setting.

Sent-to-Received Ratio. We explore the possibility of identifying the devices by computing the ratio of sent to received traffic (in terms of both bytes and frames). We consider only data traffic for this analysis, and ignore management and control frames. Figures 10A and 10B show the sent-to-received ratios in high-load and low-load settings respectively. In the high-load setting, the IP cameras (Dev-2, Dev-3 and Dev-7) have a ratio greater than 4 for traffic in bytes and greater than 1.5 for traffic in number of frames. This indicates that an IP camera that actively streams video traffic may potentially be identified when the ratio is greater than 1.0. Also, the ratio in bytes is greater than the ratio in frames for the cameras, implying that, per frame, a larger amount of data originates from the cameras. The desktop with adapter (Dev-1) has a lower ratio (>1.0) than the cameras but higher than the access point (≈1.0) and gateway (<1.0). The ratio of frames is much higher than the ratio in bytes, which indicates that the desktop sends a larger number of frames of smaller size. The access point can be clearly identified as it has a ratio close to 1.0 for both bytes and frames. The gateway (Dev-5) has a ratio less than 1.0, which indicates higher received traffic. Finally, the Amazon Echo (Dev-6) shows a ratio less than 1.0, as it continuously downloads audio traffic from the Internet.

In the low-load setting, the sent-to-received ratio does not look promising as a metric to classify the devices. The ratio does not behave in the same manner for all the IP cameras: Dev-2 has a ratio less than 1.0, while Dev-3 and Dev-7 have a higher amount of sent traffic. The Amazon Echo has a ratio higher than 1.0 in this setting. Our experiments show that an analysis of the ratio alone in low-load settings may not be good enough to identify IoT devices.

5.5 Classifying Streaming Camera

Finally, we present a use case of our IoTScanner system that addresses the issue of privacy in an IoT environment.

[...] and hops. Therefore, we consider ratios on different types of traffic volume as they look promising (see Figure 10). We perform this analysis only on data traffic as it dominates control or management traffic in high-load settings for all the devices (Figure 6).

We use two parameters here:

• $R_{sr}$: the ratio of sent to received data traffic, i.e., $R_{sr} = Tr_s / Tr_r$, where $Tr_s$ and $Tr_r$ are the amounts of data traffic (in bytes) sent from and received by a device, respectively.

• $R_{bf}$: the ratio of the traffic volume in bytes to that in frames for a device, i.e., $R_{bf} = \mathit{Bytes} / \mathit{Frames}$, where $\mathit{Bytes}$ and $\mathit{Frames}$ are the amounts of traffic in bytes and in frames respectively, as defined earlier.

Our experimental setup consists of the three IP cameras as before and some additional WiFi enabled devices like smartphones, laptops, tablets, and a printer. We run eight experiments in total, having a varying number of active devices with at least one active camera in each experiment. We have a total of 95 devices spread over all the experiments.

The goal is to classify the cameras that are actively streaming. The status (active or non-active) of a camera is not

Table 2: Confusion matrix to identify streaming camera; "others" = devices other than cameras.

n = 95    classified as camera    classified as others
camera    10                      2
others    3                       80
In changed during a single experiment though it may change particular,weshowhowtodifferentiatenearbyIoTcameras across different experiments. thatarestreamingdatafromotherdevices. Toachievethat, Results from the previous experiments give us an R we leverage our experience on traffic patterns as discussed sr rangeof12±8andanR rangeof1000±500bytes/frame. in previous sections. It may be tempting to consider abso- bf Thus, a device is identified as an actively streaming cam- lutetrafficvolumeperdevicefortheclassificationasanyIP erainanexperimentonlyifitsatisfiestheconditions4.0≤ camera produces a lot of traffic compared to other devices R ≤ 20 and 500 ≤ R ≤ 1500. In each of the experi- (Figure 6). Unfortunately, there are a number of poten- sr bf ments, we calculate the R and R for every device, and tial issues with that approach. The traffic volume can vary sr bf check if they can be identified as a streaming camera, oth- significantly depending on the observation window size. In erwise it is identified as a non-camera device (denoted as addition,asdifferentwirelesschannelscanbeusedtocarry “others”). The results of these experiments are shown in a trafficandourscannerperformschannelhopping,itmaynot confusion matrix in Table 2. We see that out of 95 device beabletocapturealltrafficforeachandeverydevice. The identifications,therewere3falsepositivesand2falsenega- cameracanalsoproducevaryingamountoftrafficdepending tives. This gives a false acceptance rate (FAR) of ≈ 3.61% on its vendor (see Figure 8). Finally, the identification may and a false rejection rate (FRR) of ≈16.67%. becomedependentonsystemparameterssuchasdwelltime Falsepositiveidentificationofonedeviceoccursduetothe presenceofoneWithingsIPcamerainourtestsettings. The A: High Load B: Low Load camera is kept switched on and configured, but no remote 25 8 Bytes Frames Bytes Frames access of live video is performed during the experiments. 7 20 Hence, it is considered as a non-camera device. 
However, e e6 v v the camera starts streaming to its associated cloud server ei15 ei5 c c as soon as it detects some movement in the area of focus, e e4 nt/r10 nt/r3 causingittobedetectedasanIPcamera. Similarly,acou- se 5 se2 ple of false negative cases are reported due to the Netatmo 1 camera in our settings. This is because our scanner fails to 0 0 sniff sufficient traffic for the device, as it performs channel 1 2 3 4 5 6 7 1 2 3 4 5 6 7 ev- ev- ev- ev- ev- ev- ev- ev- ev- ev- ev- ev- ev- ev- hopping, making a false prediction case. D D D D D D D D D D D D D D As the aim of our study is to defend against an“honest butcurious”attackerthatmakesuseofexistingnetworking Figure10: Variationoftheratioofsentandreceived infrastructure to obtain insights about not only the IoT in- traffic, per device basis in both high- and low-load frastructure itself but also human users associated with it, settings. e.g., by accessing a surveillance camera, we use only MAC layer un-decrypted traffic for this purpose. Preliminary re- 120000 data data-control Control sults of our analysis on identifying such streaming cameras inanIoTenvironmentarepromisingandcanleadtofuture 100000 300 studies for identifying other devices from MAC layer traf- 250 200 fic obtained by passive sniffing. Thus, our IoTScanner can 80000 150 play an important role for initiating the process of address- x) 100 ( ing the problem of identifying potential privacy breaches in unt 60000 500 any personalized IoT environment. o 1 2 3 c 40000 Pair- Pair- Pair- 6. BLUETOOTHLEEXPERIMENTS For the Bluetooth Low Energy (BLE) experiments, we 20000 consider six devices - a OnePlus smartphone, an August smartlock,twoFitbits,andtwoLenovotablets. Thesmart- 0 1 2 3 phone operates the smart lock via an Android app. Each air- air- air- P P P tabletisassociatedwithaFitbit,andoperatesitviaanAn- droid app. 
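As an aside on the WiFi use case of Section 5.5, the threshold rule and the error rates derived from Table 2 can be sketched in a few lines of Python. This is an illustrative sketch only, not the IoTScanner implementation: the DeviceTraffic record, the function names, the handling of zero denominators, and the choice to compute the bytes-per-frame ratio over sent plus received data traffic are assumptions made here. Only the threshold values and the confusion-matrix counts are taken from the text.

```python
from dataclasses import dataclass

# Threshold ranges quoted in Section 5.5.
R_SR_RANGE = (4.0, 20.0)      # sent-to-received ratio, in bytes
R_BF_RANGE = (500.0, 1500.0)  # bytes per frame


@dataclass
class DeviceTraffic:
    """Hypothetical per-device counters, covering data frames only."""
    sent_bytes: int
    received_bytes: int
    sent_frames: int
    received_frames: int


def is_streaming_camera(d: DeviceTraffic) -> bool:
    """Apply the two-threshold rule: both ratios must fall in their range."""
    total_frames = d.sent_frames + d.received_frames
    if d.received_bytes == 0 or total_frames == 0:
        # Assumption: a device with an undefined ratio is labeled "others".
        return False
    r_sr = d.sent_bytes / d.received_bytes
    # Assumption: R_bf is taken over sent plus received data traffic.
    r_bf = (d.sent_bytes + d.received_bytes) / total_frames
    return (R_SR_RANGE[0] <= r_sr <= R_SR_RANGE[1]
            and R_BF_RANGE[0] <= r_bf <= R_BF_RANGE[1])


def error_rates(tp: int, fn: int, fp: int, tn: int) -> tuple[float, float]:
    """Return (FAR, FRR) from confusion-matrix counts."""
    far = fp / (fp + tn)  # non-cameras wrongly accepted as cameras
    frr = fn / (fn + tp)  # cameras wrongly rejected
    return far, frr


# Counts from Table 2: 10 and 2 in the camera row, 3 and 80 in the others row.
far, frr = error_rates(tp=10, fn=2, fp=3, tn=80)
print(f"FAR = {far:.2%}, FRR = {frr:.2%}")
```

With the Table 2 counts, FAR is 3/83 and FRR is 2/12, which matches the reported ≈3.61% and ≈16.67%.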
In each experiment, one of the Fitbits is paired, allowed to sync its data and unpaired from the associated tablet; the smart lock is also paired, locked and unlocked (3-4 times) via commands from the app, and unpaired from the smartphone. Similar to the WiFi experiments, we conduct 10 experiments in each case, and compute an average of each of the measures. We do not attempt to decrypt the frames in these experiments either.

We begin by probing our network settings to detect BLE nodes and links. Then, we attempt to classify the devices via traffic categorization. It is seen that our IoTScanner is able to detect all three pairs of devices and their links. Surprisingly, we observe that the smart lock advertising frames do not follow BLE MAC address randomization (i.e., a feature expected to be used by BLE devices for privacy reasons, where the device MAC is replaced with some random MAC in the BLE frames). In the Fitbit case, we do not detect advertising frames (refer to Section 8), hence no information regarding MAC randomization is revealed. During the data exchange phase, the access address is seen to change every time the smart lock is paired. The Fitbit pairs use a single access address across all pairing events. Hence, the access address can be used to identify the Fitbit-tablet communication.

In the smart lock case, we observe three types of control frames in addition to the advertising and data frames (BTLE DATA). Those are scan request (SCAN REQ), scan response (SCAN RESP), and connection request (CONN REQ) frames. No such control frames are seen in the Fitbit case. In both cases, we observe that the data frames (observed on non-advertising channels) are further classified into data, control, and reserved sub-types. We denote the control sub-type frames as data-control (see Figure 11).

We observe traffic on all data channels in the case of the Fitbit, except for a few experiments where only one channel is seen to be carrying traffic. However, for the smart lock, only about 4 channels on average carry traffic.

We present the results on the data and control traffic per BLE pair in Figure 11. Both the Fitbit pairs generate significantly higher data traffic when compared to the smart lock (the inset figure shows a magnified version that displays data-control frames). This is expected as Fitbits perform a number of actions like movement detection, quality of sleep estimation and location information collection during every sync with the phone, whereas the smart lock exchanges only lock/unlock commands. The variation in data traffic for the Fitbit can be explained by the differences in user activity for each experiment (the relation between traffic and activity is described in [12]). Our experiments show that BLE pairs can also be classified using traffic volume analysis, though this is not done here due to an insufficient number of devices at the time of the experiments.

Figure 11: Variation of data and control frames (indicated by x in count(x)), Pair-1 and Pair-2 are Fitbit devices, and Pair-3 is smart lock device (inset plot shows data-control frames).

7. ZIGBEE EXPERIMENTS

For the Zigbee experiments, we employ the Philips Hue lighting system that consists of one Hue bridge controlling three light bulbs. We intercept the traffic among the four devices and find the number of links, nodes and amount of traffic on each link. We conduct 10 experiments in total. In each experiment, we execute different commands on the three bulbs: one bulb is switched on and off once during the experiment, one bulb is switched on and off 6 times, and one bulb is made to change its color 6 times. Each experiment lasts 120 seconds. The devices are labeled as shown in Table 3.

Table 3: Label-device mapping for Zigbee experiments.

Label    Device Name
Zig-1    Bulb - switched on and off once
Zig-2    Hue Bridge
Zig-3    Bulb - changed color
Zig-4    Bulb - switched on and off multiple times

In all the experiments, our traffic interceptor detects all four devices (results are not presented here). We concentrate on the identification of devices using traffic analysis. Figure 12 shows the amount of data sent and received by the four devices (each indicated by Zig-i). We note that the Hue bridge (Zig-2) sends and receives a higher amount of data (in size and number of frames) when compared to the light bulbs. Among the light bulbs, the bulb with the lowest amount of activity (switching on and off once), i.e., Zig-1, has a smaller number of sent and received frames.

The results also show that the controller receives about 40% more data than it sends, whereas the bulbs have about 50% less data received compared to sent. These initial tests show that an analysis of data sent and received can become a potential metric for device classification. It can be used to distinguish a light controller from its associated lights and might even help detect the amount of activity on the lights.

Figure 12: Variation of sent and received data (A) bytes and (B) frames of four Zigbee devices, one Philips Hue controlling three associated light bulbs.

8. DISCUSSION

In this section, we discuss some of the challenges that we faced during our experiments.

We observe that our wireless access point (LinkSys E1200) exposes two MAC addresses through the WiFi interface on certain occasions (for example, during our low-load experiments). On scrutiny, we discover that these are the addresses of the WiFi and Ethernet interfaces. Our traffic analyzer module does not currently have a mechanism to correlate multiple MACs to the same access point. In addition, when we use the OnePlus smartphone to interact with the IP cameras, we observe that the IoTScanner detects multiple MAC addresses communicating with the camera despite the absence of other devices in the environment. Further investigation reveals that the Marshmallow Android OS used by the smartphone creates random MAC addresses to interact with the access point. Thus, it is important to devise a mechanism to detect multiple MAC addresses as belonging to a single device.

During our BLE experiments, we observe that the smart lock uses standard advertising BLE channels 37, 38 and 39 to send control frames. Out of these, the connection request frame reveals the channel hopping sequence that is to be used during the data exchange phase. Therefore, the default "follow connection" mode of Ubertooth, which uses the connection request frame to follow connections, can be used to detect this BLE pair. However, we do not observe communication initiation on the usual advertising channels for the Fitbit. Hence, we had to use the "promiscuous" mode of the sniffer, which estimates the hopping sequence from frames on the data channels. This indicates the need to introduce a pair-specific sniffing policy in the case of BLE.

In addition, we notice that the Android applications of the BLE devices do not exhibit the same behavior. The smart lock application does not connect to the lock when the Bluetooth pairing is performed manually instead of through the application, probably indicating some security measures at the application level. However, the Fitbit application works even if the pairing/unpairing is performed manually during any Fitbit operation (such as syncing). We do not observe similar issues while working with Zigbee devices.

9. RELATED WORK

In this section, we present work that is related to the problem of monitoring complex wireless networks, including passive and active sniffing systems.

WiFi monitoring. In [21], Kotz and Essien used syslog messages, SNMP polling and tcpdump packet captures to characterize WLAN usage on a college campus over a period of 77 days. Henderson et al. [19] built upon the work of [21] by capturing traces, including VoIP traces, from a larger set of access points and users. In these works, the packets were captured by associating with access points and the trace analysis was done offline.

Davis developed a passive monitoring framework in [13] to measure resource usage on 802.11b networks, and used it to analyze various setups involving video streaming. Further work on resource usage during streaming was done in [27] and [26]. Yeo et al. implemented a wireless monitoring system in [38], using multiple sniffers that produced a merged, synchronized trace which could be used for Link layer traffic characterization and network diagnosis. They also discussed the possibility of using anomalies in Link layer traffic for security monitoring. The challenges posed by analyzing traces from multiple sniffers were further explored in [23] and [9]. In [23], the authors introduced a finite state machine to infer missing packets from a distributed system of sniffers. In [9], the authors focused on large scale monitoring by utilizing 150 monitors to capture 802.11 frames. LiveNet [8] used multiple sniffers to monitor sensor network deployments by reconstructing routing behavior and network load from captured traces. In [8], the authors proposed algorithms for route inference and topology reconstruction among nodes in a network and provided visualization of the network topology and data transfer. Chhetri and Zheng introduced the WiserAnalyzer in [10], a passive monitoring tool to capture wireless traces and infer relationships among nodes in the network. Our proposed IoTScanner aims to provide similar visualization but in real-time and for a larger number of protocols. In comparison to these papers, where analysis of the collected data was done offline, our work focuses on real-time passive analysis.

A real-time passive monitoring framework was developed by Benmoshe et al. [5] and deployed on a university campus. Details such as number of clients, channel, error rate etc. were stored in a database and a map of active devices was built. They provided some visualization in the form of a map of active devices. In contrast to [5], our work does not have prior knowledge of the network setup. SNAMP [37] was a multi-sniffer and multi-visualization platform for wireless sensor networks that could perform capture and visualization in real-time. Our work aims to enhance some of the features mentioned in these systems, with the introduction of APIs for easy access to data, more visualization features and monitoring of other protocols (Bluetooth and Zigbee).

Kismet [20] is one of the most widely used real-time, passive sniffing tools. It is targeted at monitoring 802.11 networks but offers plugins for Bluetooth and Zigbee traffic capturing. Though it provides some analysis in terms of enumerating the wireless networks, hosts and amount of data it sees, higher level analysis has to be done manually. In addition to this, Kismet does not have a detailed visualization tool. Some tools have been built on top of Kismet, mainly for the purpose of visualizing the node locations (using GPS plugins) but several of them are no longer maintained and
