Table Of ContentA Concise Network-Centric Survey of IP Traceback
Schemes based on Probabilistic Packet Marking
Matthias R. Brust Ankunda R. Kiremire
Center for Research in Cyber Security Center for Secure Cyberspace
Singapore University of Technology and Design, Singapore Louisiana Tech University, USA
matthias_brust@sutd.edu.sg ark010@latech.edu
6
1 Abstract—Multipleprobabilisticpacketmarking(PPM)schemes grouped into two orthogonaldimensions: packet marking and
0 for IP traceback have been proposed to deal with Distributed packet logging. The main idea behind packet marking is to
2 Denial of Service (DDoS) attacks by reconstructing their attack
recordnetwork path informationin packets. In mark based IP
graphs and identifying the attack sources.
b traceback, routers write their identification information (e.g.,
In this paper, ten PPM-based IP traceback schemes are
e comparedandanalyzedintermsoffeaturessuchasconvergence IP addresses) into a header field of forwarded packets. The
F
time,performanceevaluation,underlyingtopologies,incremental destination node then retrieves the marking information from
4 deployment, re-marking, and upstream graph. Our analysis the received packets and determinesthe network path. Due to
shows that the considered schemes exhibit a significant discrep-
thelimitedspaceofthemarkingfield,routersprobabilistically
] ancy in performance as well as performance assessment. We
I decide to mark packets so that each marked packet carries
concisely demonstrate this by providing a table showing that
N
(a)differentmetricsareusedformanyschemestomeasuretheir only partial path information. The network path can be con-
s. performance and, (b) most schemes are evaluated on different structedbycombiningthemarkinginformationcollectedfrom
c classes of underlying network topologies. a number of received packets. This approach is also known
[ Our results reveal that both the value and arrangement of
as probabilistic packet marking (PPM) [3]. PPM incurs little
the PPM-based scheme convergence times vary depending on
2 overhead at routers. However, it requires a flow of marked
exactly the underlying network topology. As a result, this paper
v showsthataside-by-sidecomparisonoftheschemeperformance packets to construct the network path toward their origin.
1 acomplicatedandturnsouttobeacrucialopenprobleminthis Inthis paper,we presenta concisenetwork-centricanalysis
1
research area. of a selected set of PPM-based schemes. Ten PPM-based
0
8 Index Terms—IP traceback, network security, distributed de- schemes are compared in terms of features such as conver-
0 nial of service (DDoS), probabilistic packet marking (PPM) gence time, the metrics, underlying topologies, incremental
. deployment, re-marking, and upstream graph. These schemes
1
0 I. INTRODUCTION are PPM [3], AMS [4], PPM-NPC [11], TMS [12], FIT [6],
6 Internet Protocol (IP) traceback is a method to deal with RPPM [7], TPM [8], Randomize-and-link [9], IDPPM [10],
1 Distributed Denial of Service (DDoS) attacks [1], [2]. Using and PBS [13]. The schemes considered therein are by no
:
v IP traceback, sources of attack traffic can be identified from means an exhaustive study of all the PPM-based schemes in
i the network traffic they generate. A prominent IP traceback existence.However,the collectionof schemesis largeenough
X
technique for identification of flooding style DDoS attacks to show the discrepancy in both the metrics and underlying
r
a is Probabilistic Packet Marking (PPM). In PPM-based IP topologies as well as the inadequacy of the topologies that
traceback, network routers embed their own identities in make their direct comparison difficult [14], [15].
packets randomly selected from all the network traffic that Thus,thispapershowsthatthedirectperformancecompari-
the routers process [3]. In the event of an attack, the routers’ sonoftheschemesiscomplicatedornotfeasibleatall.Asour
identity markings present in the attack packets can be used analysis show the reasons for this is (a) many schemesutilize
to reconstruct the attack graph — the paths taken by attack different metrics to measure their performance and, (b) the
traffic — and establish its sources [4], [5]. The technique many schemes are simulated on different kinds of underlying
of probabilistically marking packets for IP traceback is the network topologies, the majority of which do not provide an
basis of many schemes hereafter referred to as PPM-based adequate abstraction of the topology of the Internet or focus
schemes [4], [6], [7], [8]. Multiple additional schemes have on different characteristics of it [7], [11], [12].
been proposed with the purpose to increase the efficiency of In detail, the underlying topologies are typically tree-
PPM-based schemes [6], [8], [9], [10]. structuredwithasinglepathfromanattackertothevictim.In
Tracing the paths of IP packets back to their origin, known contrast,thetopologyoftheInternetexhibitsalternativeroutes
asIPtraceback,isanimportantstepindefendingagainstDoS that make it both resilient and scalable. Both the disparity in
attacksemployingIPspoofing.IPtracebackfacilitatesholding metricsandunderlyingtopologies,aswellastheinadequacyof
attackersaccountableandimprovingtheefficacyofmitigation thetopologies,raisequestionsabouttheperformancesofthese
measures. The existing approaches for IP traceback can be schemes in the Internet and the comparability between them.
Degree-based models
scheme ensures that every router embeds its own identity in
packetsrandomlyselectedfromthepacketstheroutersprocess
during routing. Since a large number of packets is received
in an attack, there is a considerable chance that a victim
Structural models
will have received packets with markings from all the routers
emphasis on the that were traversed by the attack packets. The victim then
degree distribution of
the internet employsthe reconstructionprocedurewhich usesthe received
the Internet
marked attack packets to map out the attack graph — the
emphasis on the paths from the victim to the attackers. The total number of
hierarchical structure
received packets required to trace the attackers is referred to
of the internet
as the scheme’s convergence time.
node edge node Multiple PPM-based schemes have since been introduced.
Spatial models emphasis on the network layer One example is the Tabu Marking Scheme (TMS) [12]. The
spatial constraints of data link layer author points out that PPM is prone to information loss as
the internet physical layer
a result of re-marking. Re-marking occurs when a router ran-
domlyselectsapacketwhichalreadyhasmarkinginformation
from an upstream router, and consequently overwrites this
information. TMS tackles this problem by ensuring that their
Figure 1. Internet topology can be captured by a variety of models which
marking scheme forfeits the marking opportunityin the event
include spatial, structural and degree-based models. Eachmodel emphasizes
different properties of the Internet and can be used to evaluate network that the randomly selected packet contains previous marking
protocols when employed in the Internet. The nodes in these topologies information. As a result, they report lower convergencetimes
represent devices operating at theInternet layer ofthe TCP/IP modelorthe
than PPM for DDoS attacks.
networklayeroftheOSImodel(e.g.routers,switches,hosts).Twonodesare
connected by an edge if Internet traffic can be directly transmitted between Another example is the Prediction Based Scheme (PBS)
themwithoutbeingforwarded byanyintermediate nodes. which also avoids re-marking [13]. However, in contrast
to TMS, the PBS marking scheme ensures that the router
information is embedded in the next available packet if the
This shows that there is a need to evaluate and compare the
randomly selected packet already has marking information.
schemesoncommonandappropriatenetworks.Theresultsof
The PBS marking scheme requires extra space cost of 1 bit
this evaluation can then be used to determine which schemes
compared to PPM. Additionally, the reconstruction procedure
are the most promising candidates for Internet deployment.
utilizes both legitimate and attack traffic to reconstruct the
In summary, the contribution of this paper is threefold: (a)
attack graph.
ananalysisofPPM-basedschemesis presented,(b)providing
Wongetal.[7]presentRectifiedProbabilisticPacketMark-
a taxonomy for PPM-based IP traceback schemes, and (c) an
ing (RPPM). Theypointout thatthe reconstructionprocedure
analytical model to argue that different underlying topologies
used in PPM [3] and Advanced and Authenticated Marking
inhibit a direct comparison between the performance of the
schemes (AMS) [4] has an imprecise termination condition.
PPM-based schemes.
They present a precise termination condition that enables
The remainderof this paperis arrangedas follows. Related
complete attack graph reconstruction within a user-specified
work, in which the selected set of PPM-based protocols is
level of confidence.
discussed, is presented in Section II. The analysis of PPM-
Multiple additional schemes have been proposed with the
based schemes as well as additional background is presented
purpose to increase the efficiency of PPM [6], [8], [9], [10].
inSectionIII,includingconsiderationsondefinitions,metrics,
Some of these schemes are analyzed in Table I. The table
and underlying topologies. The theory and system model be-
compares ten PPM-based schemes in terms of features such
hindourworkisanalyzedinSectionIV.Wediscussoutcomes
as convergence time, the metrics, underlying topologies, in-
of this paper and conclude in Section V.
cremental deployment, re-marking, and upstream graph. The
schemes considered therein are by no means an exhaustive
II. RELATEDWORK
study of all the PPM-based schemes, but the selection of
Inthissection,we describeanddiscussa setofPPM-based schemes is large enough to show the discrepancy in both the
IP traceback schemes for which we will provide an analysis metrics and underlying topologies as well as the inadequacy
in Section III. of the topologies that make their direct comparison difficult.
PPM-based schemes consist of a marking scheme and a The feature incremental deployment refers to whether the
reconstruction procedure, and are based on the assumption schemewouldbesuccessfulifthemarkingschemeisdeployed
that large amounts of traffic are used in a (D)DoS attack onafractionoftheroutersinthenetwork.Onlyafewschemes
[3]. In their original work, Savage et al. [3] propose that explicitly state that they would be successful when partially
the PPM marking scheme is employed at all times in all the deployed [3], [4], [6], [9].
routers in the network, while the reconstruction procedure is Re-markingreferstowhetherthemarkingschemeatarouter
employedbythevictimintheeventofanattack.Themarking permitstheoverwritingofpreviousedgeorrouterinformation
in a packet. The majority of the considered schemes permit incorrectrecombinationwherefragmentsfromroutersinG
act
re-marking of packets [3], [4], [6], [7], [8], [9], [10]. are combinedto identify an innocentrouter in G′ , whereby
act
Upstream graph refers to whether a scheme requires a G′ refers to the complement of G, as being part of G .
ret
previously obtained map of the network to successfully trace Ideally, the returned attack graph should be identical to the
the specific path taken by attack traffic. Some of the works actual attack graph, i.e. G =G .
ret act
addresshowsuchamapcanbeobtainedtoaidinattackgraph
Definition 4 (False positives). False positives are nodes in-
reconstruction [4], [6], [13].
correctly identified as belonging to the attack graph. Given
Other features such as metrics, convergence time, and un-
an attack graph G and a reconstructed attack graph G ,
derlying topology are discussed in subsequent sections. act ret
the false positives are identified as all members of the set
It is important to point out that PPM-based schemes are
{G ∩G′ }.
not the only proposed approaches to IP traceback [1], [2]. ret act
Alternatives include packet logging [16], specialized routing Definition 5 (False negatives). False negatives are nodes
[17], Internet control message protocol (ICMP) traceback in the attack graph that are not identified as such by the
[18],deterministicpacketmarking[19]andhybridapproaches reconstruction procedure. Given an attack graph G and
act
which combine different traceback techniques [20], or com- a reconstructed attack graph G , the false negatives are
ret
bine traceback with anomaly detection [21]. identified as all members of the set {G ∩G′ }.
act ret
III. PROTOCOL ANALYSIS The expressions for convergence time shown in Table I
are the analytical bounds for C = f(d,p) given G is the
In this section, an analysis of the discussed PPM-based
Single Path, Single attacker (SP/SA) topology and the recon-
schemes from Section II is provided. This section is catego-
struction procedure has no prior knowledge of the underlying
rized according to definitions and expressions,
topology. The SP/SA topology occurs when all the attack
A. Definitions and expressions traffic originates from a single source and takes the same
path to the victim. The convergence time expressions shown
The performance of traceback schemes is commonly mea-
in Table I are derived using the coupon collector’s problem
sured by the convergencetime, also known as the reconstruc-
[22]. In randomize-and-link [9], l is the number of fragments
tiontime.Theconvergencetimeisdefinedasthetotalnumber
that each router splits its address into, and H is the nlth
of received packets required by the reconstruction procedure nl
harmonic number. The schemes in [4], [6], [8], [10] do not
to returnthecompleteattack graph[3]. The convergencetime
provide mathematical modeling of their convergencetimes in
isdependentonthemarkedpacketdistributionP, the number
the SP/SA topology.
of attackers n, and attack graph G as below.
Savage et al. [3] and Tseng et al. [11] point out that the
Definition 1 (Convergencetime). The convergence time C is ideal marking probabilityp dependson the attack path length
defined as C = f(Pi,n,G) where G(v,e) is a graph made d.InPPM[3],themarkingprobabilityisp=0.04.Thisvalue
up of a set v of nodes and a set e of edges on which M is ofpisbasedonthereportthatthepathlengthd≤25formost
implemented. Internetpaths[23]. Thisvalueofp ismaintainedinAMS[4],
TMS [12], RPPM [7], and PBS [13].
Definition 2 (Attack path length). The attack path length d
Wongetal. [7]presenta reconstructionprocedureinwhich
is the number of edges (hops) that the attack traffic traverses
they count the number of packets required to construct a
between an attacker A and the victim V.
fully connected attack graph, and then count extra packets to
Definition3(Markedpacketdistribution). Themarkedpacket guarantee with a certain level of confidence that all edges in
distributionPi isdescribedasthemarkedpacketsreceivedby theattackgraphareinfactaccountedfor.Toensureuniformity
the victim from the ith attacker Ai given i ∈ [1,n] where n across the schemes considered in our simulations, we use a
is the number of attackers. We define Pi = f(p,d,M) as it simplified version of their reconstruction procedure in which
depends on the marking probability p, the marking scheme the numberof packetsrequiredto accountfor all edgesin the
M, and attack path length d. complete attack path is counted.
Alargevarietyofgraphsareusedastheunderlyingtopolo-
An alternative measure of the performance of a traceback
gies in the considered protocol performance analysis. The
scheme is the correctness of the attack graph G that the
ret
majority of uses tree-structured graphs G = {v,e;|v| =
reconstructionprocedurereturnswhen comparedto the actual tree
attack graph G . |e|+1} [3], [7], [11], [12], [9], [13].
act
This is especially important in marking schemes where the
B. Metrics
router IP address is broken into fragments because of space
constraints in the packet header as in [3], [4], [6], [8]. In In this section, we describe the metrics used in the con-
these works, a fragment of the router’s IP address, instead of sidered PPM-based schemes (as shown in Tables I and II) in
the entire address, is transmitted in the packet header, which terms of the definitions given in the preceding section.
requires an extra function of combining the fragments during 1) Convergence time metrics: The convergence time C is
thereconstructionprocedure.Thisintroducesthepossibilityof the fundamental measure of the speed and success of any
TableI
ACOMPREHENSIVECOMPARISONOFPPM-BASEDIPTRACEBACKSCHEMESSHOWINGTHEVARIETYOFMETRICSEMPLOYED,ANDTHETOPOLOGIES
USEDFORTHEEVALUATIONOFTHOSESCHEMES.THETABLEPROVIDESACOMPARISONOF10DIFFERENTPPM-BASEDSCHEMESOVERTHEIR
FEATURES.THESEFEATURESINCLUDEconvergence time,metricsUSEDFOREVALUATIONOFTHESCHEMES,WHETHERTHEYREQUIREPRIOR
KNOWLEDGEOFTHEupstreamgraphTOCORRECTLYIDENTIFYATTACKERS,WHETHERTHEYCANBEincrementally deployed,ANDunderlying topology.
THECONVERGENCETIMEEXPRESSIONSPRESENTEDINTHETABLEAREFORADOSSCENARIOASSUMINGNOPRIORKNOWLEDGEOFTHENETWORK
TOPOLOGYWHILETHEUNDERLYINGTOPOLOGYSHOWSTHENETWORKTOPOLOGIESUSEDINTHEEVALUATIONOFTHESCHEMES.THESETOPOLOGIES
INCLUDEsinglepath,singleattacker(SP/SA),singlepath,multiple attacker(SP/MA),ANDmultiple paths,multiple attacker(MP/MA).THEMETRICS
SHOWNINTHISTABLEAREDESCRIBEDINTABLEII
Incremental Upstream
Scheme Year Convergence time Metrics Re-marking Underlyingtopology
deployment graph
PPM[3] 2001 ≤ p(1l−np(d)d)−1 1 yes yes no SP/SA(max.30hops)
Traceroute data set (103402 destina-
AMS[4] 2001 undetermined 1,4,5 yes yes yes
tions,2000attackers)
PPM-NPC[11] 2004 ≤ ln(d)+0.58 7 no no no SP/SA(10hops)
p
TMS[12] 2005 ≤ p(1l−np(d)d)−1 1,2 no no yes Binarytree(6hops,32sources)
Skittermap(174409hosts,5000attack-
FIT[6] 2005 undetermined 1,3,6 yes yes yes
ers)
RPPM[7] 2008 < ln(d) 5,10,11,12 no yes no SP/SA, binary tree, random tree net-
p(1−p)d−1 work(15,100,500,1000nodes)
TPM[8] 2008 undetermined 1,4,8,9 no yes yes Skitterdata(avg.18hops)
Randomize-and-link 2008 < nlHnl 7,11,12 yes yes no Binarytree(10hops)
[9] p(1−p)d−1
IDPPM[10] 2010 undetermined 12,13 no yes no SP/SA(20-32hops)
PBS[13] 2012 ≤ ln(d) 1,2,7 no no yes/no SP/SA,SP/MA,MP/MA,50nodenet-
p work,100nodenetwork
PPM-based scheme and it is dependent on many factors. A the number of nodes |v| in the attack graph.
variety of metrics has been used to measure this dependence 2) False positive and false negative metrics: Some re-
with respect to these various factors. searchers have investigated the various factors that affect the
Convergence time versus attack path length shows how C accuracy of the traceback process.
varies with d. This is normally done for a simple topology False positives versus false negatives is a measure of how
Gtree, with n = 1 but is done for larger values of n in [4], |{Gret ∩ G′act}| varies with |{Gact ∩ G′ret}|. This form of
[12], [7], [8], [13]. a Receiver Operating Curve (ROC) is used to evaluate the
Convergence time versus number of attackers is a measure scheme in [6].
of how C varies with n for a specified G. This metric is False positive versus number of attackers investigates the
considered in [7], [9], [13], [11]. variation of the number of false positives |{G ∩ G′ }|,
ret act
Convergence time versus marking probability is used by with n for G=Gtree. This is considered in [4], [8].
Wong et al. [7] to investigate the relationship between C and Falsenegativeversusnumberofpacketsshowshowthefalse
marking probability p for a topology Gtree, where n = 1, negatives |{Gact ∩G′ret}| vary with the number of received
d = 3 and show that large p values actually increase the packets without any false positives. This plot is used in [6]
convergencetimewhileverylowpvaluesresultinlargevalues with n = 100 to show that the accuracy improves with an
of P′. This relationship is investigated by Goodrich [9] for increase in received packets.
i
G , n = 1000, d = 10. He finds that the ideal marking 3) Additional metrics: There are additional metrics that
tree
probability is p ≈ 1. do not fall into the convergence time, or false positive and
ideal d
Convergence time versus number of routers is a descrip- false negative categories. These metrics are generally used
tion of the relationship between convergence time C and to understand the extra dynamics involved in the traceback
the size |v| of a given graph G. This metric is used in process.
[9] for |v| ∈ {50,100,250,500,1000}, in [7] for |v| ∈ Fraction of distinct markings versus number of attackers
{15,50,100,500,1000},and in [10] for |v|∈[3,30]. measures the proportion of P with distinct markings as a
i
Reconstruction time in seconds versus the number of at- function of n at a given value of d. This metric is used in
tackers is used in [4] to measure the variation of C with n. [8].
However, C is translated from packets to time in seconds. Number of marked packets versus distance from victim is a
As a result, this metric is dependent on the system on which description of the distribution of P . It shows how P varies
i i
the victim is running the reconstruction procedure, as well withrouterdistancel,wherel∈[0,d]. Thiswasevaluatedfor
as the bandwidth and latency of the topology on which the G withn∈{1,32}in[12],andwithn∈{1,3,5}in[13].
tree
simulation is being carried out. In [7], the authors investigate Probability that a packet is unmarked versus path length
the relationship between reconstruction time in seconds and is used in [8] to show the variation of P′ with d for M ∈
i
TableII
[13] to represent the non-re-marking category. The analytical
THISTABLESHOWSMETRICSUSEDINTHECONSIDEREDPPM-BASED
SCHEMESANDTHEIRINDICESFROMTABLEI. models for these three schemes are markedly different from
each other,evenforequalmarkingprobability,because ofthe
Index Metric definition differences in the schemes’ underlying marking algorithms.
1 Convergence timevs.attackpathlength The performance of any PPM-based scheme can therefore be
comparedto either oneof these schemes, or a combinationof
2 Numberofmarkedpacketsvs.distancefromvictim
them.
3 Falsepositivesvs.falsenegatives
Becauseofre-markinginPPM,thevictimtypicallyreceives
4 Falsepositivesvs.numberofattackers
moremarkingsfromclose-byroutersthanfromdistantrouters.
5 Reconstruction timeinsecondsvs.numberofattackers
Thechanceofreceivinga markedpacketfroma routerl hops
6 Falsenegatives vs.numberofpackets
away is given by the geometric distribution expression p(1−
7 Convergence timevs.numberofattackers p)l−1. Thisisbecauseareceivedmarkedpacketindicatesthat
8 Fractionofdistinctmarkingsvs.numberofattackers that packet was selected by a router (with probability p), and
9 Probabilityapacketisunmarkedvs.pathlength not selected (with probability 1−p) by all l−1 subsequent
10 Successfulratevs.traceback confidencelevel routers.The analysis for PPM can thereforebe applied to any
11 Convergence timevs.markingprobability schemewherethemarkingsfromdistantroutersarerarerthan
markings from close-by routers.
12 Convergence timevs.numberofrouters
InTMS,thedecisiontoforfeitamarkingopportunityifthe
13 Overheadonroutersvs.numberofnodes
packetispreviouslymarkedmeansthatmarkingsfromrouters
distantfromthevictimaremoreprevalentthanmarkingsfrom
{PPM,TPM}. closerrouters.Thechanceofreceivingamarkedpacketfroma
Successful rate versus traceback confidence level is used routerlhopsawayisgivenbyp(1−p)d−lwheredistheattack
by Wong et al. [7]. They present a reconstruction procedure pathlength.Thisisbecauseareceivedmarkedpacketindicates
that uses a traceback confidence level to determine when to that that packet was selected by a router (with probability p),
haltthetracebackprocess.Theypresentthismetric,whichisa after not being selected (with probability 1−p) by all d−l
measureofhowtheproportionoftimes{G =G |G = previousrouters.Thisanalysiscanbeappliedtoallschemesin
ret act act
G ,n = 1,d = 3} varies with the traceback confidence which markings from distant routers are more prevalent than
tree
level for p∈{0.1,0.5,0.9}. markings from close-by routers.
Overheadonroutersversusnumberofnodesisusedin[10] In contrast to TMS, the PBS marking scheme compensates
to show the overhead cost on the routers as a function of the for the missed marking opportunities. Therefore the chance
number of nodes |v| in the attack graph G. The overhead on of receiving a marking from a router l hops from the victim
the routers is measured in terms of the number of markings is given by p for any router in the path. This analysis can
performed. be applied to all schemes in which the markings from the
Insummary,TablesIandIIexposethelargenumberofmet- routers are equally prevalent regardlessof their distance from
ricsthatareusedtomeasuretheperformanceofthementioned the victim.
PPM-based schemes. Coupledwith the differingsettings used These three schemes therefore provide an adequate basis
totestthesemetrics,thismakesdirectcomparisonofdifferent to understand the impact of the network topology on other
schemes problematic. PPM-based schemes.
C. Categories of PPM schemes D. Underlying topologies
Despite their large number, PPM-based schemes have sim- The performanceof a PPM-based tracebackscheme should
ilar underlying algorithms in their marking schemes. The be evaluated on the Internet itself. However, because of the
underlying algorithm is responsible for how the packets in very large-scale dynamic and heterogeneous structure of the
which the router identities are embedded are selected. For Internetattemptstocarryoutempiricalprotocolevaluationare
example, the majority of the considered schemes exhibit expensive and inflexible [24]. As a result, researchers resort
underlying algorithms in which all routers randomly select to simulations implemented on underlying topologies which
packets with equal probability p [3], [4], [6], [7], [8], [9], are considered to be simplified abstractions of the topology
[10].Theschemesinthiscategoryarepronetore-marking.We of the Internet [24], [25], [26], [27]. An underlying topology
refertothiscategoryasthere-markingcategoryofPPM-based is represented by a graph G(v,e) consisting of nodes v and
schemes.Intheothercategoryofschemes,therouters’packet edgese where the nodesrepresenteither devices with routing
selection process is only partially random. The underlying capability or end hosts. An edge between any two nodes
algorithmsinthiscategoryprohibittheoverwritingofprevious means that traffic can be directly transmitted between those
routerinformationandasaresultexhibitperformancesthatare two devices (cf. Fig. 1) [24].
notablydifferentfromthere-markingcategory[12],[13],[11]. As shown in Table I, a variety of underlying topologies
We select three representative marking schemes: PPM [3] have been used to evaluate the performance of PPM-based
to represent the re-marking category, and TMS [12] and PBS schemes.Duringsetup,themarkingalgorithmisimplemented
in the nodes (routers) in the graph. To simulate the attack, a tree-structured topology. However, because tree-structured
packets are transmitted from one or more nodes (representing topologies do not exhibit the alternative routes ubiquitous in
theattackers)toonespecificnode(representingthevictim).A theInternet,theycannotbeusedasanappropriateabstraction
reconstructionprocedureis then implementedat the victim to of the Internet topology.
mapouttheattackgraphG .Theattackgraphconsistsofall In summary, network models used to simulate the Internet
act
nodesandedgesin the underlyingtopologythatwere directly topologyfallinto threecategoriesbasedon thepropertiesthat
involved in transmitting the attack packets. The underlying theyemphasize,namelydegree-basedmodels,structuralmod-
topologies used in PPM-based schemes range from simplistic els and spatial models (cf. Fig. 1). The emphasis of degree-
to complex. based models is the degree distribution of the nodes in an
The single path, single attacker (SP/SA) is a simple topol- attempt to recreate the power law observationsin the Internet
ogyconsistingofasingleattackernodesendingpacketsalong [26], [29]. The structural models arrange the nodes to mimic
a single identical path to a single victim node. The length of the hierarchical structure of the Internet, with Internet traffic
the path varies with each work ranging from 3 hops to 32 being transmitted through routers located within autonomous
hops [3], [7], [13], [11]. This setup is used to simulate the systems [24], [27]. The spatial models place emphasis on the
performance of PPM schemes during a flooding style DoS locationofthenodeswithanytwonodesbeingconnectedonly
attack. if they are within a transmission range of each other [30].
The Single Path, Multiple Attacker (SP/MA), and Multiple
Paths, Multiple Attacker (MP/MA) topologies consist of mul- IV. TOPOLOGY INFLUENCEANALYSIS:A
tiple sources of attack traffic to simulate a DDoS attack. The DEMONSTRATION
SP/MA simulatesa uniquetopologyin whichallthe attackers In the last part, we argued that the comparison of perfor-
are locatedat differentdistancesfromthe victim butall along mance results is difficult because the underlying topologies
a single identical path [13]. The MP/MA simulates a more used are fundamentally different. Here, an analytical basis
generaltopologywhereeachattackerhasauniquepathlinking is given that shows how subtle differences in the network
it to the victim node. In some cases, the paths are completely topologies contribute to differences between the convergence
independent [7], while in other cases, the paths merge closer times of the PPM-based schemes.
to the victim [12], [7], [9], [13]. The convergence time for PPM-based schemes has been
OneuniqueMP/MAtopologyisatree,e.g.abinarytree.In originallymodeledasthecoupon-collector’sproblem[22],[3].
this case, the attack graph is modeled as a tree with some or A coupon collector seeks to collect d equally likely distinct
all the leaves at a certain depth representing the attack nodes, coupons by drawing them from an urn with replacement.
andthe rootof the tree representingthe victimnode[12], [7], Whileittakesashorttimetogetthefirstfewuniquecoupons,
[9]. This setup ensures that different attack paths merge the it takes considerably longer to get the last few coupons that
closertheyaretothevictim.AswithSP/SAandSP/MA,there complete the entire collection. The expected number of turns
is only one path in the attack graph from an attack node to needed to draw all d distinct coupons grows as Θ(d·ln(d))
the victim node. [22].
Schemes such as AMS, FIT, and TPM have been evaluated When the coupon collector problem is applied to packet
using actual data sets from the Internet [4], [6], [8] including marking,themarkedpacketsareconsideredtobethecoupons.
traceroute data sets from Lucent Bell labs [4] and CAIDA’s For example, Fig. 2(a) shows a single path linking attacker A
skitter map [6], [8]. These data sets are used to produce to victim I and the target of the “coupon collector” would
topologies that are typically larger than the simple topologies be to collect markings for all 7 edges. However, the expected
mentioned thus far and provide better abstractions of the time expression above must be modified to apply them to the
Internet structure. However, they do not provide an adequate packetmarkingproblembecauseonemayormaynot“draw”a
description of the Internet since they are typically restricted markedpacketin thepacketmarkingproblemandthemarked
to restricted snapshotof the Internetnetwork [16]. Traceroute packetshave unequalchancesof being received.Savage et al.
datasets have also been reported to be inaccurate when the [3] deal with the unequal edge probabilities by utilizing the
network has asymmetric paths [6]. probability of the least likely edge to providean upper bound
One common feature with these underlying topologies is on the expected convergence time.
their tree-like structure. A tree-structured topology G ex- Formally, given a single path of l hops implementing the
tree
hibitsa single pathfromanygivenattackerto the victim.The PPMschemewithroutermarkingprobabilityp,theleastlikely
choiceoftree-structuredtopologiesisbasedontheassumption edgeistypicallytheedgelocatedclosesttotheattackerwhich
that all attack traffic from one attacker will take the same has a probabilityp(1−p)l−1 of being receivedby the victim.
path to the victim. This assumption is in turn based on the Given d unique markings, the probability of receiving any
observationthatInternetpathsarelargelyinvariantparticularly markingat the victim is thereforeat least dp(1−p)l−1 which
over short periods of time [28]. These assumptions have is the product of the number of unique markings and the
allowed researchers to simplify the simulation process by probability of the least likely edge. The expected number of
ignoringthe routingcapabilities of the network and enforcing packets E[x] required to complete the marking “collection”
a predefined set of paths for attack traffic as evidenced in in order to build the attack graph is derived by dividing the
A A A'
B B a prohibitively complicated undertaken. The reasons for this:
C C Firstly,mostschemesutilizedifferentmetricstomeasuretheir
D D
performance. Secondly, many schemes are evaluated on dif-
E E
ferentkindsofunderlyingnetworktopologies,themajorityof
F G A'' F G
a which do not provide an adequate abstraction of the topology
1-a
of the Internet or focus on different characteristics of it.
I H I
Theunderlyingtopologiesaretypicallytree-structuredwith
a single path from an attacker to the victim. In contrast, the
Figure2. SampleattackpathslinkingattackerAtovictimI.Theattackpath
in Fig. b exhibits Subgraph 4 in which the traffic can either take path FGI topology of the Internet exhibits alternative routes that make
withprobability a,orpathFHIwithprobability 1−a. it both resilient and scalable. Both the disparity in metrics
and underlying topologies, as well as the inadequacy of the
topologies, raise questions about the performances of these
original coupon collector expectation by dp(1−p)l−1 which
schemes in the Internet. For example, which scheme would
yields [3].
perform best when deployed on an appropriate network—a
ln(d)
E [x]< (1) network similar to the Internet? Does the performance of a
0,PPM p(1−p)l−1
PPM-based scheme vary with the network on which it is
Therefore, the final expression for the upper bound of the implemented?Ifso,isitpossibletoprovideacommonground
expectedconvergencetime is obtainedby dividingthe natural for comparing scheme performance when the schemes are
logarithmofthenumberofdistinctedgesd,bytheprobability implemented on different networks? These questions show
p(1−p)l−1 of the least likely edge in the attack path. For the that there is a need to evaluate and compare the schemes
SP/SA topology, the number of hops is equal to the number on common and appropriate networks. The results of this
of unique markings (l=d). evaluation can then be used to determine which schemes are
In Fig. 2(a), the least likely edge is AB with a probability the most promising candidates for Internet deployment.
ofp(1−p)l−1.However,theprobabilityofreceivingedgeFG The work presented herein has implications that reach out-
in Fig. 2(b) is given by ap(1−p) which is considerably less side the field of IP traceback. For example, do other network
thanthe probabilityof ABfor shortpath lengths.Inthis case, protocols also display a dependence on the network topology
the convergencetime is given by Equation 2. exhibited by the network on which they are implemented? If
so,canthesedependenciesbeexploitedtoyieldbetterprotocol
ln(d)
E [x]< (2) performance in specific types of networks? These questions
1,PPM ap(1−p)
should encourage multiple topology evaluations of network
Comparing Equation 1 and Equation 2 reveals that the con- protocols.
vergence time is increased by a factor of (1−p)l−2. This
a REFERENCES
means that even in the best case when both alternative paths
[1] A.BelenkyandN.Ansari,“Oniptraceback,” IEEEComm.Magazine,
are equally likely (a = 0.5), the convergence time of a 3-
pp.142–153, 2003.
hop attack graph is multiplied by a factor of 1.92 while [2] Z. Gao and N. Ansari, “Tracing cyber attacks from the practical
a 15 hop attack graph is multiplied by a factor of 1.18. perspective,” IEEEComm.Magazine, pp.123–131, 2005.
[3] S.Savage,D.Wetherall,A.Karlin,andT.Anderson,“Practicalnetwork
If one of the two paths only carries a tenth of the traffic
supportforiptraceback,”ACMSIGCOMM,vol.30,pp.295–306,2000.
(a = 0.1), the convergence times of the 3-hop and 15-hop [4] D. X. Song and A. Perrig, “Advanced and authenticated marking
attack graphs are multiplied by a factor of 9.6 and 5.88 schemesforiptraceback,”inProc.ofIEEEINFOCOMConf.,pp.878–
886,2001.
respectively. Additionally, the alternative paths factor affects
[5] J. Mesit and M. R. Brust, “Secured node-to-node key agreement for
short attack paths more than long attack paths. wireless sensor networks,” in Proc. of International Conference on
The cases for TMS and PBS are similar. It can be shown Information Networking (ICOIN),pp.37–39,IEEE,2015.
[6] A.Yaar,A.Perrig,andD.Song,“Fit:fastinternettraceback,” inProc.
that for TMS and PBS the convergence time is increased by
ofIEEEINFOCOMConf.,pp.1395–1406,2005.
a factor of 1 regardless of the path length. [7] T.-Y. Wong, M.-H. Wong, and C.-S. Lui, “A precise termination con-
a
This shows that alternative paths reduce the probability of dition of the probabilistic packet marking algorithm,” IEEE Trans. on
Dependable andSecureComputing, pp.6–21,2008.
the least likely edges in an attack path for all the considered
[8] V.Paruchuri,A.Durresi,andS.Chellappan,“Ttlbasedpacketmarking
schemes, and consequentlyincreases their convergencetimes. foriptraceback,” inProc.ofIEEEGLOBECOMConf.,pp.1–5,2008.
[9] M.T.Goodrich, “Probabilistic packet markingforlarge-scale iptrace-
V. CONCLUSION back,”IEEE/ACMTrans.Netw.,pp.15–24,2008.
[10] Q.Yan,X.He,andT.Ning,“Animproveddynamicprobabilisticpacket
In this paper, ten PPM-based IP traceback schemes are markingforiptraceback,” IJCNIS, pp.47–53,2010.
compared and analyzed. Although, it is a non-exhaustive [11] Y.K.Tseng,H.H.Chen,andW.S.Hsieh,“Probabilisticpacketmarking
withnon-preemptivecompensation,”IEEEComm.Letters,pp.359–361,
study, the number of schemes is sufficiently representative to
2004.
show existing discrepancies in performance and performance [12] M. Ma, “Tabu marking scheme to speedup ip traceback,” Computer
evaluation between the schemes. Networks, pp.3536–3549, 2006.
[13] A. R. Kiremire, M. R. Brust, and V. V. Phoha, “A prediction based
As a result of this analysis, it becomescomprehensivewhy
approachtoiptraceback,”inProc.oftheIEEEConf.onLocalComputer
a direct comparison of PPM-based schemes turns out to be Networks, 2012.
[14] A. R. Kiremire, M. R. Brust, and V. Phoha, “Topology-dependent [22] W. Feller, An Introduction to Probability Theory and Its Applications.
performanceofattackgraphreconstructioninppm-basediptraceback,” Wiley,1968.
inIEEEConsumerCommunicationsandNetworkingConference(IEEE [23] R.CarterandM.Crovella, “Serverselection usingdynamic pathchar-
CCNC),pp.363–370,IEEE,2014. acterization in wide-area networks,” in Proc. IEEE INFOCOM Conf.,
[15] A.R.Kiremire,M.R.Brust,andV.V.Phoha,“Usingnetworkmotifsto pp.1014–1021, 1997.
investigatetheinfluenceofnetworktopologyonppm-basediptraceback [24] K. L. Calvert, M. B. Doar, and E. W. Zegura, “Modeling internet
schemes,”Computer Networks, 2014. topology,”IEEECommunications Magazine, pp.160–163, 1997.
[16] J. Li, M. Sung, J. Xu, and L. Li, “Large-scale ip traceback in high- [25] H. Tangmunarunkit, R. Govindan, S. Jamin, S. Shenker, and W. Will-
speedinternet:practicaltechniquesandtheoreticalfoundation,”inProc. inger,“Networktopologies, powerlaws,andhierarchy,”Comput.Com-
ofIEEESymp.onSecurityandPrivacy, pp.115–129,2004. mun.Rev,2001.
[17] R.Stone, “Centertrack: anipoverlay networkfortracking dosfloods,” [26] M. Faloutsos, P. Faloutsos, and C. Faloutsos, “On power-law relation-
inProc.ofUSENIXSecurity Symp.,pp.15–15,2000. shipsoftheinternettopology,”inACMSIGCOMM,pp.251–262,1999.
[18] S.Bellovin,M.Leech,andT.Taylor,“Icmptracebackmessages,”2003. [27] A.Medina, A.Lakhina,I.Matta, andJ.Byers,“Brite: Anapproachto
[19] A. Belenky and N. Ansari, “Ip traceback with deterministic packet universal topology generation,” inProc.ofthe Int.Symp. inModeling,
marking,”Comm.Letters, IEEE,pp.162–164,2003. AnalysisandSimulation ofComputer andTelecom. Systems,2001.
[20] M.-H. Yang and M.-C. Yang, “Riht: A novel hybrid ip traceback [28] V. Paxson, “End-to-end routing behavior in the internet,” ACM SIG-
scheme,”InformationForensicsandSecurity,IEEETrans.on,pp.789– COMM,pp.41–56,2006.
797,2012. [29] B. Waxman, “Routing of multipoint connections,” IEEE Journal on
[21] Y. Xiang, K. Li, and W. Zhou, “Low-rate ddos attacks detection and Selected AreasinComm.,pp.1617–1622, 1988.
tracebackbyusingnewinformationmetrics,”InformationForensicsand [30] B. N. Clark, C. J. Colbourn, and D. S. Johnson, “Unit disk graphs,”
Security, IEEETrans.on,pp.426–437,2011. DiscreteMathematics, pp.165–177,1990.