
Impression Fraud in On-line Advertising via Pay-Per-View Networks

17 Pages·2013·2.43 MB·English


Kevin Springborn, Broadcast Interactive Media; Paul Barford, Broadcast Interactive Media and University of Wisconsin-Madison

This paper is included in the Proceedings of the 22nd USENIX Security Symposium. August 14-16, 2013, Washington, D.C., USA. ISBN 978-1-931971-03-4. Open access to the Proceedings of the 22nd USENIX Security Symposium is sponsored by USENIX.

ABSTRACT

Advertising is one of the primary means for revenue generation for millions of websites and mobile apps. While the majority of online advertising revenues are based on pay-per-click, alternative forms such as impression-based display and video advertising have been growing rapidly over the past several years. In this paper, we investigate the problem of invalid traffic generation that aims to inflate advertising impressions on websites. Our study begins with an analysis of purchased traffic for a set of honeypot websites. Data collected from these sites provides a window into the basic mechanisms used for impression fraud and in particular enables us to identify pay-per-view (PPV) networks. PPV networks are comprised of legitimate websites that use JavaScript provided by PPV network service providers to render unwanted web pages "underneath" requested content on a real user's browser so that additional advertising impressions are registered. We describe the characteristics of the PPV network ecosystem and the typical methods for delivering fraudulent impressions. We also provide a case study of the scope of PPV networks in the Internet. Our results show that these networks deliver hundreds of millions of fraudulent impressions per day, resulting in hundreds of millions of lost advertising dollars annually. Characteristics unique to traffic delivered via PPV networks are also discussed. We conclude with recommendations for countermeasures that can reduce the scope and impact of PPV networks.

1. INTRODUCTION

Advertising is one of the primary methods for generating revenues from websites and mobile apps. A recent report from the Internet Advertising Bureau (IAB) places ad revenues in the US for the first half of 2012 at $17B, which represents a 14% increase over the previous year [15]. While the majority of that revenue is search-based, ad words advertising, display and video advertising have been growing. Indeed, a recent report places display and video advertising in the US at $12.7B for FY2012, growing at 17% annually [27]. At a high level the basic notion of selling space on web pages and apps for advertising is simple. However, the mechanisms and infrastructure that are required for online advertising are highly diverse and complex.

The online ad ecosystem can roughly be divided into three groups: advertisers, publishers and intermediaries. Advertisers pay publishers to place a specified volume of creative content with embedded links (i.e., text, display or video ads) on websites and apps. Intermediaries (e.g., ad servers and ad exchanges) are often used to facilitate connections between publishers and advertisers. Intermediaries typically place a surcharge on the fees paid by advertisers to publishers for ad placements and/or ad clicks. What is immediately obvious from this simple description is that publisher and intermediary platform revenues are directly tied to the number of daily visits to a website or app. Thus, there are strong incentives for publishers and intermediaries to use any means available to drive user traffic to publisher sites.

There are certainly legitimate methods for traffic generation for publisher sites. The most widely used are the text-based ad words that appear in search results, e.g., from Google or Bing. However, it can be quite difficult and expensive to drive large traffic volumes to target sites using ad words alone (footnote 1). Thus, other methods for traffic generation have emerged, many of which are deemed fraudulent by advertisers and intermediaries. Google defines invalid (fraudulent) traffic as follows (footnote 2):

    Invalid traffic includes both clicks and impressions that Google suspects to not be the result of genuine user interest [21].

Standard methods for generating invalid traffic include (i) using employees at publisher companies to view sites and click on ads, (ii) hiring 3rd parties to view sites and click on ads, (iii) click/view pyramid schemes and (iv) using software and/or botnets to automate views/clicks [21]. The challenges for advertisers and intermediaries focused on offering trustworthy platforms are to understand these and potentially other threats so that effective countermeasures can be deployed.

Footnote 1: This has led to the emergence of a large number of Search Engine Optimization companies in recent years.
Footnote 2: While Google is not the only company in this domain, we refer to them as an authoritative source of information due to their size and experience in online advertising.

In this paper, we investigate a relatively new threat for display and video advertising called Pay-Per-View (PPV) networks. The basic idea for PPV networks is to pay legitimate publishers to run specialized JavaScript when users access their sites that will display other publishers' websites in a camouflaged fashion. This will result in impressions and potentially even clicks that are registered on the camouflaged pages without "genuine user interest", i.e., invalid traffic generation. Legitimate publishers view this as another way to monetize their sites without impact to their users. PPV networks sell their traffic generation capability by touting real and unique users, geolocation and context specificity among other things. The fact that pages are appearing on real users' systems makes detecting and preventing PPV traffic generation challenging.

To study PPV networks, we employ a small set of honeypot websites that we use as the target for traffic generation. These sites were constructed to include what appears to be legitimate content and advertising. We then use search to identify a wide variety of traffic generation offerings on the Internet. We purchased impressions for our honeypot sites in various quantities from a selection of different traffic generation services over the course of a 3.5 month period. By engaging with traffic generation services directly, we were able to uncover the basic mechanisms of PPV networks and initiate additional measurements to characterize their deployments.

The characteristics of the traffic purchased for our honeypot sites are dictated at a high level by the service offerings, which enable volume, time frame and geographic location, etc. of users to be specified. Our results show that impressions are typically spread in a somewhat bursty fashion over the specified time frame and that user characteristics are well matched with specifications. By considering the referer field of the incoming traffic, we were able to identify the fact that our honeypot sites were being loaded into a frame (along with as many as ten other sites) for display on remote systems. By considering names of a small selection of traffic generation services, we use a recent, publicly-available, Internet-wide web crawl to identify the scope of PPV networks. We find tags from these services are, in fact, widely deployed, on tens of thousands of sites. By appealing to MuStat [29], we conservatively estimate the number of invalid impressions that are generated from this small set of PPV networks to be on the order of 500 million per day. Assuming a modest quality level for sites that are part of PPV networks, we estimate the cost to advertisers for this invalid traffic to be on the order of $180 million annually.

Finally, we offer three different methods to defend against PPV networks. First, observing viewport dimensions of ad requests can determine if the end user can possibly view the advertisement. In an effort to increase traffic, PPV networks commonly display destinations in zero sized frames. Second, blacklists of websites that participate in PPV networks can potentially be used. The idea is to block advertising on websites that commonly receive PPV traffic until the publisher discontinues purchasing PPV traffic. Such blacklists can be compiled through programmatic enumeration of PPV destinations. Finally, referer fields can be queried at the time of advertisement load in order to identify traffic originating from known PPV domains.

The remainder of this paper is organized as follows. In Section 2 we provide a description of the online advertising ecosystem and an overview of invalid traffic generation threats. In Section 3, we describe the details of our honeypot websites and our traffic purchases for these sites. In Section 4, we describe the details of the evaluations that we conduct on our data including analyses of additional data sets and measurements that enable us to project some of the broader characteristics of PPV networks. We provide recommendations for countermeasures that can be employed to reduce the impact of PPV networks in Section 5. We discuss prior studies that inform our work in Section 6. We summarize, conclude and discuss future work in Section 7.

2. ONLINE ADVERTISING ECOSYSTEM

In this section we provide an overview of the online advertising ecosystem including both the business framework and technical framework for delivering advertisements to publisher websites and apps. Some prior studies have provided similar overviews including [16, 34, 41]. We also provide an overview of invalid traffic generation threats and the challenges they pose in the ecosystem.

2.1 Business Framework

As mentioned in Section 1, there are three main participant groups in ad networks: advertisers, intermediaries and publishers. As shown in Figure 1 there are two other important groups: brands and users. Brands pay advertisers to help them sell their products and services. Internet-based campaigns are attractive to brands and advertisers since consumers/users spend a growing proportion of their time online. An important appeal of online advertising (especially for consumer goods) is that it offers the opportunity to tie ad campaigns and associated costs directly to sales, e.g., by tracking clicks from online ads to purchases on a brand's ecommerce site.

[Figure 1: Key participants in the online advertising ecosystem. Payments flow from brands to advertisers to intermediaries and publishers.]

Advertisers are companies that create and manage advertising campaigns for brands. Advertisers pay publishers to make ad placements on websites and apps using one of several different models. One is the widely used Pay-Per-Click (PPC) model, where an advertiser only pays a publisher for an ad when a user clicks on it. PPC campaigns are typically associated with ad words (short, text-based ads) campaigns. An alternative payment method that is common in display and video advertising is Cost Per Mille/Thousand (CPM), where advertisers pay publishers whenever users view an ad (CPM prices are given per thousand impressions). The CPM-based payment model is the primary focus for this paper. The goal for advertisers is to place ads on sites that they believe attract a brand's target demographic in a cost-effective fashion. Thus, their challenge is in identifying these sites and facilitating ad placement.

In addition to working with publishers directly, advertisers often work with intermediaries in order to actually place ads on websites and apps. The two main reasons for this are the complexity of Internet advertising's technical landscape (see below) and the enormous and growing diversity of websites and apps. Among other things, intermediaries offer "one-stop shopping" for advertisers, and competitive CPM rates to publishers who may not be able to fill all of their placements via direct campaigns.

The scope of intermediaries is quite broad. The most common offerings include targeting services, ad servers and ad exchanges to facilitate placements. One of the most widely used intermediaries in the display advertising space is Google AdExchange (AdX) [20, 30]. The revenue model that is most commonly used by intermediaries is to take a small CPM payment for each ad that they participate in serving and then to pass the remainder of the CPM paid by the advertiser to the publisher.

Internet publishers are companies that create content that is of interest to users. Publishers display ads on their pages using standard sized creatives that typically appear in an iframe. A publisher's goal is to maximize their revenue yield by attracting (i) premium advertisers that pay high CPMs and (ii) a high volume of users, some of whom will click through on ads. It is important to note that while ad words-based advertising (e.g., through AdSense) is widely available, display and video ads are typically only available to sites that have somewhat higher volumes of users.

2.2 Technical Framework

Displaying an advertisement on a publisher's page includes potentially a large number of data exchanges between participants in the advertising ecosystem. A simple example is depicted in Figure 2. The process begins with the placement of an ad tag in a section of a publisher page. Ad tags (often supplied by intermediaries that manage ad servers) are simple HREF strings that typically reference JavaScript code hosted in a CDN infrastructure.

[Figure 2: Typical data exchanges required to render an ad in a user's browser. (1) User request to publisher page. (2) Base page delivered. (3) Ad tag request to CDN. (4) JavaScript delivered. (5) Update to JavaScript in CDN if necessary. (6) Request to ad server. (7) Redirected request delivered. (8) Request to exchange or 3rd party ad server. (9) Ad creative delivered.]

The JavaScript typically gathers context keywords and other information from the publisher page or user browser and then sends an ad request to the target ad server infrastructure. Ad servers process the ad request and either respond with an ad directly (e.g., from a direct advertiser campaign) or send a redirect to a third party such as an ad exchange. The redirect is forwarded by the browser to the target server or exchange, which will respond with an ad that is rendered in the browser. The redirect usually includes sufficient information for ad targeting and billing. This entire process must take place quickly (typically on the order of tens of milliseconds) in order to ensure a good user experience. When the ad is delivered, an impression is registered for the ad serving entity. Click tracking is typically managed by directing clicks to the ad server, which then redirects to the advertiser.

2.3 Invalid Traffic Generation Threats

Impression-based advertising has a number of potential threats. The focus of this paper is on traffic generation that causes invalid impressions and thereby inflates publisher and (some) intermediary revenues. Specifically, we focus on invalid traffic generation via PPV networks, which we describe in detail in Section 4.

Valid methods for traffic generation include search and ad words-based advertising. However, web search reveals that there is a wide variety of other traffic generation offerings available. Many offer a specified volume of traffic at a target site over a specified time period. Many also include guarantees of specific features in the traffic such as geographic locations of host systems. Most do not describe their methodology in detail if at all. One of the important objectives of traffic generation is that it appear to come from real users. Appealing to the definition of invalid traffic given in Section 1 above, there are many ways in which such traffic might be generated.

Common methods for invalid traffic generation have been borrowed directly from click generation services that have been offered for some time. Examples include hiring people to view pages, bots of various types, and using expired domains to divert users to 3rd-party pages.

PPV networks are sites that load 3rd-party pages in an obfuscated fashion when accessed by users. Publishers become part of a PPV network simply by placing a tag on their site that looks very much like a standard ad tag. We define a "network" as a series of sites that run tags from the same PPV service. Participating publishers are paid on a CPM basis for something that appears to be low or no impact on their site.

Since the third party pages that are rendered via PPV networks are clearly not the interest of the users, all of the resulting impressions are invalid. Beyond lacking the intent necessary to qualify as valid traffic, we show that PPV network traffic has characteristics unlike organic traffic. For example, natural traffic displays a diurnal traffic pattern, while the PPV traffic we observed often showed highly artificial delivery patterns.

3. DATA COLLECTION ON HONEYPOT WEBSITES

To begin our investigation of traffic generation and impression fraud we established a set of honeypot websites. We then purchased traffic from a number of different services and captured a diverse set of data from the resulting hits on our sites. In this section we provide details on our honeypot websites and traffic purchases. The results of these activities are described in detail in Section 4.

3.1 Honeypot Websites

We created three websites as the starting point for our investigation of traffic generation service providers. The sites differed only in styling, formatting, and deployment. The content on each site was identical. The reason for creating three different sites was to enable us to conduct A-B comparisons between different traffic generation services.

The design objective for our honeypots was to create sites that looked relatively "legitimate". To that end, they have a standard layout, content changes regularly and the deployment is standard. A second objective was that the sites were instrumented to gather as much data as possible on arriving traffic.

Each site consisted of a base landing page and four subpages. Three of the pages displayed RSS content from the news feeds of topwirenews.com or espn.com. One page listed links to popular news sites. The final page was a non-functional search result. Every page contained four advertisement placements, identical to standard CPM placements except they contained dummy creatives instead of displaying paying advertisers' placements. All of the ads have embedded links to dummy landing pages that we also monitor.

[Figure 3: Screenshot of one of the honeypot websites that was a target for traffic generation purchases.]

Domain names were registered for each site with GoDaddy using their anonymous registration option. We attempted to give the sites names that sounded interesting and connoted the news-related content of the sites. The sites were created using dotCMS inside Amazon EC2. Amazon's CloudFront CDN was enabled for the sites in order to handle larger bursts of traffic. We used a "noindex, nofollow" meta tag and a robots.txt file to attempt to prevent inclusion in search engine results.

Instrumentation was facilitated in several ways. Google Analytics tags were deployed on all pages for general monitoring. Logs from the serving infrastructure were used to understand the details of individual connections. A series of JavaScript blocks collected information about the site visitors. The instrumentation reported viewer characteristics (see Table 1) using 1x1 pixels. Each advertisement on the sites was instrumented with code that reported the three key events in the life cycle of every ad: (1) JavaScript load, (2) JavaScript execution and (3) successful delivery. Finally, the pages contained JavaScript that tracked user interaction on the site. Similar to [41], the interaction metrics reported mouse movements and clicks. The mouse position was collected every time the cursor moved at least 20 pixels.

Table 1: Visitor information collected from honeypot websites.
Timestamp | Client IP
URL | User Agent
User UID | Page Load UID
Viewport Dimensions | Referer

3.2 Purchased Traffic

We identified and reviewed 34 traffic generation service providers for this study. These service providers were identified using web search. We manually reviewed each service provider's site to catalog available purchasing options. Details of the sites and options are given in Table 3. We make no claims on the completeness of this list of traffic generation service providers. However, given the commonality of their offerings, we believe that they are a representative cross section.

We also investigated the service provider websites themselves to gain some insights on their legitimacy. Their domain names were checked with McAfee SiteAdvisor [6]. The DNS record was inspected using Network Solutions' Whois tool [8]. Finally, a tool available from SameID.net [9] was used to search for sites sharing the same IP address or Google Analytics tag.

From the set of 34 traffic generation services, we selected 5 from which we made purchases. Services were selected to get a diversity of delivery rates and price points. The characteristics of our purchased traffic indicated the selected services were independent networks. The purchased traffic was directed to the honeypot sites between November 11th, 2012 and February 18th, 2013, resulting in over 69K delivered impressions. We used target URLs including Google Analytics campaign parameters [5] to help to differentiate overlapping purchases.

Our purchasing strategy was oriented around diversity and not volume. Details of the purchased traffic can be found in Table 2. With the exception of BuildTraffic, all traffic purchased was designated as only traffic from the United States and labeled as news and information. The intended delivery rate of purchased traffic varied between 333 visitors per day and 25,000 visitors per day. We intend to investigate further diversity and higher volume purchases in future work.

Table 2: Traffic purchases made for this study.
Vendor | Amount | Runtime | Price
MaxVisits | 10,000 | 5 days | $11.99
BuildTraffic | 20,000 | 60 days | $55.00
AeTraffic | 10,000 | 7 days | $39.95
BuyBulkVisitor | 20,000 | 5 days | $53.00
TrafficMasters | 50,000 | 2 days | $70.00

3.3 Pay-Per-View Publisher Signup

In addition to traffic generation itself, PPV service providers also offer publishers the opportunity to participate as a traffic source in their network (this was our initial indication of PPV networks). To further investigate the mechanisms of traffic generation, we enrolled as a website owner willing to display content with a PPV service provider called InfinityAds. The signup was completed using InfinityAds' fully automated publisher signup system on their website. Upon signup we were given a block of JavaScript to load on our site. In return for running this tag, the website owner is assured of a relatively attractive CPM (quoted and qualified at $1.80) and that "...popunder ads will not block any of your site content and do not lead to actions where users might be led to leave your site." [23]. In this case, pop-under windows are the method that InfinityAds uses to generate traffic. We describe these in more detail below.

4. PAY-PER-VIEW NETWORK CHARACTERISTICS

In this section we report the results of our analysis of purchased traffic at our honeypot sites. This analysis reveals the mechanisms used to drive traffic to target sites and opens the door to a broader analysis of PPV networks, which is also reported below.

4.1 Traffic Generation Offerings

We reviewed the details of the 34 traffic generation/ecommerce sites that we identified via web search using strings like "website traffic", "buying web traffic", "web trafficking", etc. Features such as traffic characteristics, pricing, timing, reseller information, and DNS entries were noted for each site. Details are listed in Table 3.

Table 3: Traffic provider details.
Site | Price (2) | Geotargeting | Category | Pacing | Adult | Allow Pop-up/Sound
aetraffic.com | $75 | Yes | Yes | Option | Option | Yes (2)
allseostar.com | NA | No | No | No | Option (2) | No
bringvisitor.com | NA | No | No | No | ? | Yes (2)
buildtraffic.com | $119 | Yes | Yes | 30 days | ? | No
buybulkvisitor.com | $53 | Yes | Yes | Option | ? | No
buyhitscheap.com | $110 | Yes | No | No | ? | Yes
cheapadvertising.biz | NA | No | No | No | Option | ?
cmkmarketing.com | $82 | Yes | Yes | No | ? | No
cybertrafficstore.com | $70 | Yes | Yes | 30 days | Option | ?
easytraffic.biz | $100 | Yes | Yes | 60 days | ? | No
fulltraffic.net | $220 | Yes | No | No | ? | ?
getwebsitetraffic.org | $75 | Yes | Yes | Option | Option | Yes (2)
growstats.com | $84 | Yes | Yes | Option | ? | Yes (2)
handytraffic.com | $99 | Yes | Yes | Option (2) | ? | Yes
highurlstats.com | $200 | Yes | Yes | 30 days | ? | ?
hitpro.us | $60 | Yes | Yes | 30 days | ? | No
ineedhits.com | $120 | Yes | Yes | 30 days | ? | No
masvisitas.net | No information, nowhere to enter website URL
maxvisits.com | $30 | Yes | Yes | Option | ? | Yes
meantraffic.com | $30 | Yes | Yes | No | Option (2) | ?
perfecttraffic.com | $43 | Yes | Yes | Option | ? | ?
plusvisites.com | $30 | Yes | Yes | Option | ? | ?
purchasewebtraffic.net | $99 | No | No | No | ? | No
realtrafficsource.com | $55 | Yes | Yes | No | ? | ?
revisitors.com | $119 | Yes | Yes | Option (2) | Option (2) | Yes (2)
source4traffic.com | $88 | Yes | Yes | 30 days | ? | No
thewebtrafficdominator.com | $32 | Yes | Yes | No | Option (2) | ?
toptrafficwholesaler.com | $111 | Yes | Yes (2) | 30 days | Option (2) | No
traffic-masters.com | $35 | Yes | Yes | Option | Option (2) | No
trafficchamp.com | $89 | Yes | Yes | 30 days | No | No
trafficelf.com | $55 | Yes | Yes | Option | Option (2) | Yes
trafixtech.com | $35 | Yes | Yes | Option | Option (2) | No
visitorboost.com | $116 | Yes | Yes | 30 days | ? | No
xrealvisitors.com | NA | No | No | No | ? | ?
(1) Cost to purchase 25,000 United States visitors (normalized where needed)
(2) Extra cost

4.1.1 Pricing

There is no uniform pricing for traffic providers. The pricing given in Table 3 was normalized to the cost of delivering 25,000 visitors from the United States for comparison. Of the 34 traffic generation services that we investigated, five of them did not allow purchasing traffic originating exclusively from the United States. One site was deemed fraudulent because it did not have a space to enter a traffic destination prior to checkout completion. The remaining 28 sites charged between $29.99 and $200 to purchase 25k visitors.

4.1.2 Overlap/Reselling

There were significant similarities between many of the traffic purchase sites. Many of the providers made multiple copies of their site in order to target different publisher segments or to simply use another attractive domain name. All of the provider domains were assessed using the sameid.net domain investigation tool [9]. Seven of the providers appeared to be repackaging another site (handytraffic, cmkmarketing, visitorboost, revisitors, buybulkvisitor, highurlstats, xrealvisitors). Four of the repackaged sites shared a Google Analytics account with another traffic provider site (handytraffic, cmkmarketing, visitorboost, revisitors). Three of the repackaged sites shared an IP address with another traffic purchase site (buybulkvisitor, highurlstats, xrealvisitors). Shared website hosting could cause IP overlap, but it is unlikely that 3 sites in our 34 site sample are randomly hosted on the same IP. Furthermore, an implementation error caused highurlstats.com to load buybulkvisitor.com, making it plausible that these sites are related.

Four of the PPV sellers investigated offered the ability to become a traffic reseller (hitpro, ineedhits, toptrafficwholesaler, traffic-masters). A reseller sells traffic without having to manage traffic delivery infrastructure or payment processing. The reseller acts only as an intermediary forwarding orders along to the true traffic provider. As per the descriptions, the reseller is charged a fixed rate for the traffic and can resell the traffic at the price of their choosing. Two of the reseller packages offered prepackaged websites where the reseller only needs to supply their branding and marketing.

4.1.3 Provider Site Analysis

Given the potentially fraudulent nature of traffic generation, we were interested in a general measure of the trustworthiness of providers' sites. McAfee's SiteAdvisor [6] rated most of the provider websites as safe. Specifically, out of the 34 sites investigated 22 were labeled as Safe, 11 had not yet been reviewed by SiteAdvisor, and 1 was labeled as suspicious.

4.1.4 DNS Registration

A Whois lookup was performed on each of the traffic providers' websites to gain insights on deployments. 14 out of the 34 sites listed a DNS anonymization service as their primary contact. Four of the sites were registered or renewed in the previous 12 months. Expiration and creation dates give the period of the domain registration. On average the sites were registered for 5.71 years. The longest registration was for 16 years. Six sites are registered for only 1 year.

Looking at the contact information of the sites not using anonymization gave the following breakdown of country residency: 10 United States, 2 Australia, 2 Canada, 2 Spain, 1 France, 1 Italy, 1 Singapore, 1 China.

4.1.5 Features

Providers offer a variety of options for purchased traffic. Many provide assurances that only "real" traffic will be delivered and no "black hat techniques" are used. Every site promises unique views, such that the same user will not be directed to the site multiple times in 24 hours. Six sites were more precise, specifying that a user's IP address will only be directed to the destination once in a 24-hour period. Typical traffic volumes range between 10K and 1M visitors per campaign. Direct email was required for campaigns larger than 1M visitors. See Table 4 for other options offered by the traffic providers that we evaluated.

Table 4: Traffic provider features.
Adsense Safe | Safe to use with Google AdSense
Adult Traffic | Deliver users interested in porn
Alexa Boost | Traffic to increase Alexa ranking
Allow Pop-ups/Sound | No restrictions on destination
Campaign Pacing | Select length of campaign
Geo-targeting | Deliver users from a region
Clicks | Deliver clicks on target website
Mobile Traffic | Deliver users of mobile devices
Traffic Classes | Deliver users with specific interest

4.2 Purchased Traffic Characteristics

One of our purchases did not deliver any appreciable volume of traffic. The reason for the failure of traffic delivery is not clear. The provider may have decided not to deliver due to the instrumentation of the destination site. The provider still collected payment for the traffic which was not delivered. See Tables 5 and 6 for a summary of our measurements. Of the target of 110,000 visits that we purchased, we received 69,567. At the time of writing AeTraffic was still delivering visitors beyond the campaign end. The BuildTraffic purchase stopped delivering visitors abruptly at the end of January, 28 days into the 60-day campaign.

Table 5: Purchased traffic delivery.
Vendor | Expected Visitors | Delivered Visitors | Expected Duration | Actual Duration | % Loading all 4 Ads
AeTraffic | 10,000 | 17,205 | 7 days | 8 days (1) | 16.40
BuildTraffic | 20,000 | 1,086 | 60 days | 29 days | 60.75
BuyBulkVisitor | 20,000 | 1 | 5 days | 1 day | Unknown (2)
MaxVisits | 10,000 | 9,635 | 5 days | 5 days | 12.80
TrafficMasters | 50,000 | 41,640 | 2 days | 3 days | 58.34
(1) Still sending traffic at the time of submission
(2) User failed to load JavaScript

Table 6: Purchased traffic characteristics.
Vendor | IP Sources | % US IPs | % Blacklisted | Unique User Agents | % Mobile User Agents | % Zero Sized
AeTraffic | 17,075 | 93.14 | .99 | 3,331 | 5.44 | 34.08
BuildTraffic | 1,079 | .17 | 1.75 | 312 | 14.42 | NA (1)
BuyBulkVisitor | 1 | 100 | 0 | Unknown (2) | 0 | Unknown (2)
MaxVisits | 8,551 | 98.83 | .47 | 1,883 | 4.03 | NA (1)
TrafficMasters | 14,489 | 99.29 | 1.65 | 3,096 | 4.59 | 47.34
(1) Detailed tracking not implemented at time of purchase
(2) User failed to load JavaScript

We analyzed traffic delivered to our honeypot websites for a variety of characteristics. Before processing, the data was filtered to remove any events originating from our honeypot server's IP address. Also any user agent containing case-insensitive 'bot' was excluded. This was done to remove the effects of web crawler traffic from our results. All of the traffic observed appeared to originate from our purchases. We did not see any indications of natural traffic.

4.2.1 Blacklist Comparison

The IP addresses of the purchased traffic showed some overlap with public IP blacklists. Every morning at 7 GMT IP blocklists were pulled from DShield.org [3] and UceProtect [10] as points of comparison. The count of blacklisted IP addresses from these sources averaged 303,968 (or 0.007% of the entire IP space) for January 2013. On average, source IP addresses of the purchased data matched the blacklists 0.97% of the time. This is perhaps more than would be expected by chance, but too low to draw a strong conclusion about overlap between the set of sources from traffic generation services and malicious sources.

4.2.2 Interaction

Each of our honeypot pages tracked four JavaScript events: onmousemove, onmousedown, onblur, onfocus. There was an extremely small number of activity events (190) reported for all purchased traffic. There are a few explanations for such low interaction: (i) it may be an accurate reflection of reality, (ii) the site was 0 sized and the user could not interact with it (see 4.2.7) or (iii) it could be the result of JavaScript events not firing as expected. Unfortunately we cannot rule out JavaScript failure. We cannot draw strong conclusions from the lack of interaction events other than the fact that we did not pay for anything other than impressions.

4.2.3 Temporal Distribution

The pacing of visitor delivery varied greatly depending on traffic service provider. As is described below, service providers traffic millions if not billions of visitors a day, but individual purchases can require delivery of less than 100 visitors a day to a destination. Furthermore, the network throughput is not guaranteed, so the deliveries need to be slightly front-loaded to ensure full delivery in the case of lower than expected throughput. The problem of pacing manifested itself in both the time of arrivals within a day and the arrival distribution over the entire campaign.

The daily arrival patterns of visitors showed some unusual artifacts. AeTraffic delivered consistently through the entire day, as can be seen in Figure 4. It is well known that typical user traffic follows a diurnal cycle, reaching the high peak during the day and low peak overnight when users are sleeping. A more obvious example of artificial delivery is BuildTraffic, which delivered only during the first 10 minutes of the hour, as can be seen in Figure 5.

The arrival of users throughout the campaign was quite bursty in some cases, with periods of high delivery followed by periods of low delivery. MaxVisits delivered traffic primarily in the first half of every day, as can be seen in Figure 6. Meanwhile, TrafficMasters delivery primarily consisted of two large spikes with little delivery between, as can be seen in Figure 7.

[Figure 4: Traffic distribution from AeTraffic. Impressions per 5 minutes vs. time (UTC).]
[Figure 5: Traffic distribution from BuildTraffic. Impressions per 5 minutes vs. time (UTC).]
[Figure 6: Traffic distribution from MaxVisits. Impressions per hour vs. time (UTC).]
[Figure 7: Traffic distribution from TrafficMasters. Impressions per hour vs. time (UTC).]

4.2.4 Incomplete Loads

Every page on our honeypot sites contained four JavaScript blocks which loaded advertising creatives. Each creative was independently instrumented to report when it had been loaded. Four blocks of JavaScript need to complete in order to successfully load all of the ads on the pages. Using this information, we can calculate the percentage of page loads that completed for all four ads.

Traffic from BuildTraffic and TrafficMasters resulted in ads completely loading approximately 60% of the time. AeTraffic and MaxVisits only loaded approximately 15% of the time. Reasons for failure to load all the ads include: JavaScript blockers, JavaScript errors, JavaScript execution timeout, and navigation away from the page.

4.2.5 IP Address Distribution

IP addresses from an entire traffic generation campaign were checked for duplicates to get an idea of the distribution of traffic sources. According to the advertised 24-hour-unique policy an IP address could be used once per day. For small purchases our data showed very little overlap of IP address for the campaigns: AeTraffic reused 0.75% of IP addresses, BuildTraffic reused 0.64% and MaxVisits reused 11.25%. The larger purchase from TrafficMasters showed significantly more IP address overlap with 65% of IPs reused. The majority of the IPs geolocated inside of the US, with the exception of the BuildTraffic IPs.

4.2.6 User Agents

The number of unique user agents across the purchases shows the traffic came from a diverse set of browsers. An alternative explanation could be that artificial traffic generators utilized a large set of User Agent strings. However, combined with the diverse set of IP addresses, it appears the traffic could well be generated from genuine viewers.

About 5% of the traffic from AeTraffic, MaxVisits and TrafficMasters had the User Agent signature of a mobile device. BuildTraffic traffic contained a much higher percentage of mobile device User Agents, possibly due to the increased geographic diversity of the traffic.

4.2.7 Viewport Size

Halfway through our purchases we instrumented the code [...]

[...]licited windows. PPV networks need to circumvent this restriction. One option is that the PPV code can explicitly bypass browser protections. A review of the issue trackers for Chrome or Firefox does not list many bugs related to the browsers' pop-up blockers, thus this is likely to be a difficult coding challenge. Our empirical data did not show any PPV network tags that attempt to bypass the pop-up blocker directly.
Thecommonapproachistotiepop-undercreationto torecordtheelementheightandwidth.3 Overall46.51%of auseractionsincebrowserstypicallyallowcreationofnew adviewshadaheightorwidthof0,meaningthattheadver- windowsontheseevents. Typicallythepop-underactionis tisementcouldnotpossiblybeviewedbytheuser. 13.42% attached to the onclick event of the body of the page. This ofviewshadbothaheightandwidthofzero. Theseresults causesthepop-underactiontofirewhenevertheuserclicks corroboratetheBuildTrafficdeliverytechniqueofzero-sized anywhereonthesite. framesdescribedin4.3.1. After creation, the pop-under window is directed to load a specific URL pointing to the network’s ad server. The 4.3 Pay-Per-ViewNetworks adserver URLcontainsanumber ofparametersdescribing By examining the JavaScript provided by traffic genera- targeting and attribution of the visitor. The parameters al- tion services and the referer fields from traffic on our hon- waysincludeanidentifierfortheoriginatingsitesothatthe eypotsites,wewereabletoidentifythefactthattrafficwas publisher can get paid for the traffic. The list of parame- generatedprimarilyfrompop-underwindows. Interestingly, tersisclearlydependentonindividualimplementations,but while we did see evidence of traffic from expired domains, someofthemorecommontargetingparametersare:(i)user- wesawnoevidenceoftrafficfrombotnets.Thisobservation Token, (ii) indication if adult sites are allowed, (iii) user ledtoourdeeperinvestigationoftheuseofpop-undersfor IP/geolocation,and(iv)viewportsize. Usingtheseparame- trafficgenerationandourcharacterizationofPPVnetworks. terstheadserverselectsandreturnsthemostprofitable3rd- As noted above when publishers participate in a traffic partywebsites(i.e.,thepublishersthathavepurchasedtraf- generationservicei.e.,aPPVnetwork,theyaregivenablock fic) available. This is presumably the point where the 24 of JavaScript to place on their site, which looks very much houruniqueuserguaranteeisenforced. likeastandardadtag. 
InthecaseofPPVnetworks,whena Manually loading a publisher’s PPV network tag often useraccessesaPPVnetworkpublisherpage,theJavaScript showed multiple redirections through a network of PPV opens a new window (typically behind the active browser servers. This mimics what is seen in standard advertis- window,henceapop-under)andloadsthePPVserverURL. ing networks where an individual ad can be redirected Thepublisherrunningthetaggetsashareoftherevenuefor across many networks in order to optimize the return from everyPPVURLthatissubsequentlyloaded. ThePPVnet- each user. For example, repeatedly loading the InfinityAds worksolvestwoproblemswithrespecttomarshalingusers: publisher tag showed network connections being made to (i) it delivers the JavaScript which creates the pop-under ads.lzjl.com, windowand(ii)itdeterminesthesitetodisplayinthewin- cpxcenter.com, and 199.21.148.39. Whois and reverse IP dow. lookups on these all indicate YesUp eCommerce Solutions Inresponsetoprevalentpop-upadvertising,webbrowsers Inc.forthecontactinformation. YesUpislocatedinOntario give users the option to prevent pages from opening unso- CanadaandhasahostofeCommerceofferings. Ideally we would have identified the referer to the main 3Using document.documentElement.clientHeight, docu- pop-under page in our purchased traffic. This would en- ment.body.clientHeight, window.innerHeight depending on browsertype. able us to identify the sites hosting pop-under tags. Un- 9 USENIX Association 22nd USENIX Security Symposium 219
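The reuse percentages quoted in Section 4.2.5 appear to follow from Tables 5 and 6 as the share of delivered visits that did not arrive from a previously unseen IP address, that is, 1 - (IP sources / delivered visitors). The sketch below recomputes them under that assumption; the function name and dictionary layout are ours and are not taken from the measurement code used in the study.

```python
# Delivered visitors (Table 5) and distinct IP sources (Table 6) per vendor.
DELIVERED = {
    "AeTraffic": 17205,
    "BuildTraffic": 1086,
    "MaxVisits": 9635,
    "TrafficMasters": 41640,
}
IP_SOURCES = {
    "AeTraffic": 17075,
    "BuildTraffic": 1079,
    "MaxVisits": 8551,
    "TrafficMasters": 14489,
}


def ip_reuse_percent(delivered: int, unique_ips: int) -> float:
    """Percentage of visits that came from an IP already seen in the campaign.

    Assumed definition: (delivered - unique_ips) / delivered.
    """
    return 100.0 * (delivered - unique_ips) / delivered


if __name__ == "__main__":
    for vendor, delivered in DELIVERED.items():
        reuse = ip_reuse_percent(delivered, IP_SOURCES[vendor])
        print(f"{vendor}: {reuse:.2f}% of visits reused an IP address")
```

Under this reading the BuildTraffic, MaxVisits and TrafficMasters values match the figures quoted in the text, and the AeTraffic value agrees to within rounding, which suggests (but does not confirm) that this is how the percentages were derived.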

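The pre-processing step described in Section 4.2 (dropping events from the honeypot server's own IP address and any event whose user agent contains 'bot', case-insensitively) can be sketched as follows. This is a minimal illustration, not the authors' code: the `Event` record, its field names, and the example addresses (taken from the reserved documentation range) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Event:
    """Hypothetical record for one logged page or ad load."""
    ip: str
    user_agent: str


def filter_events(events: Iterable[Event], honeypot_ip: str) -> List[Event]:
    """Keep only events that are neither self-generated nor from declared bots."""
    kept = []
    for event in events:
        if event.ip == honeypot_ip:
            continue  # traffic originating from the honeypot server itself
        if "bot" in event.user_agent.lower():
            continue  # case-insensitive match on 'bot' in the user agent
        kept.append(event)
    return kept


if __name__ == "__main__":
    sample = [
        Event("192.0.2.10", "Mozilla/5.0 (Windows NT 6.1)"),
        Event("192.0.2.1", "Mozilla/5.0 (X11; Linux x86_64)"),  # honeypot server
        Event("192.0.2.20", "ExampleBot/2.1"),
    ]
    for event in filter_events(sample, honeypot_ip="192.0.2.1"):
        print(event.ip, event.user_agent)
```

A substring match of this kind is deliberately coarse: it removes crawlers that identify themselves, but it cannot catch automated clients that present an ordinary browser user agent.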