Table Of ContentSecuring Virtual Network Function Placement with
High Availability Guarantees
Marco Casazza∗, Pierre Fouilhoux†, Mathieu Bouet‡, and Stefano Secci†
∗Università degli Studi di Milano, Dipartimento di Informatica, Italy. Email: marco.casazza@unimi.it
†Sorbonne Universités, UPMC Univ Paris 06, UMR 7606, LIP6, France. Email: {pierre.fouilhoux,stefano.secci}@upmc.fr
‡Thales Communications & Security, France. Email: mathieu.bouet@thalesgroup.com
Abstract—VirtualNetworkFunctionsasaService(VNFaaS)is can induce the failure of all the overlying services [12]. To
currentlyunderattentivestudybytelecommunicationsandcloud achieve high availability, backup VNFs can be placed into the
stakeholders as a promising business and technical direction
7 NFVI, acting as replicas of the running VNFs, so that when
consisting of providing network functions as a service on a
1 the latter fail, the load is rerouted to the former. However,
cloud (NFV Infrastructure), instead of delivering standalone
0 network appliances, in order to provide higher scalability and not all VNFs are equal ones: the software implementing a
2
reduce maintenance costs. However, the functioning of such network function of the server where a VNF is running may
n NFVI hosting the VNFs is fundamental for all the services and bemorepronetoerrorsthanothers,influencingtheavailability
a applications running on top of it, forcing to guarantee a high of the overall infrastructure. Also, client requests may be
J availability level. Indeed the availability of an VNFaaS relies on
7 the failure rate of its single components, namely the servers, routed via different network paths, with different availability
2 the virtualization software, and the communication network. performance.Thereforetoguaranteehighlevelsofavailability
The proper assignment of the virtual machines implementing it is important not only to increase the number of replicas
] network functions to NFVI servers and their protection is placed on an NFVI, but it is also crucial to select where they
I essential to guarantee high availability. We model the High
are placed and which requests they serve.
N
AvailabilityVirtualNetworkFunctionPlacement(HA-VNFP)as
In this context, we study and model the High Availability
. the problem of finding the best assignment of virtual machines
s to servers guaranteeing protection by replication. We propose a Virtual Network Function Placement (HA-VNFP) problem,
c
[ probabilisticapproachtomeasuretherealavailabilityofasystem that is the problem of placing VNFs on an NFVI in order
anddesignbothefficientandeffectivealgorithmsthatcanbeused to serve a given set of clients requests guaranteeing high
1 by stakeholders for both online and offline planning.
availability. Our contribution consist of: (a) a quantitative
v
probabilistic model to measure the expected availability of
3 I. BACKGROUND
9 VNFplacement; (b)a proofthat theproblem isNP-hardand
9 Arecenttrendincomputernetworksandcloudcomputingis thatitbelongstotheclassofnonlinearoptimizationproblems;
7 tovirtualizenetworkfunctions,inordertoprovidehigherscal- (c) a linear mathematical programming formulation that can
0
ability, reducing maintenance costs, and increasing reliability be solved to optimality for instances of limited size; (d) a
.
1 of network services. Virtual Network Functions as a Service VariableNeighborhoodSearch(VNS)heuristicforbothonline
0 (VNFaaS) is currently under attentive study by telecommuni- and offline planning; (e) an extensive simulation campaign,
7
cations and cloud stakeholders, as a promising business and andalgorithmintegrationinaDecisionSupportSystem(DSS)
1
technical direction consisting of providing network functions tool [28].
:
v (i.e., firewall, intrusion detection, caching, gateways...) as a The paper is organized as follows: in Section II we present
i
X Service instead of delivering standalone network appliances. the HA-VNFP, and in Section III we briefly describe previous
r While legacy network services are usually implemented by works on VM/VNF placement in cloud/NFVI systems. In
a means of highly reliable hardware specifically built for a SectionIVweformallydescribetheoptimizationproblemand
single purpose middlebox, VNFaaS moves such services to propose a linearization of the problem and a mathematical
a virtualized environment [18], named NFV Infrastructure programming formulation that can solve it to optimality. In
(NFVI) and based on commercial-off-the-shelf hardware [22]. SectionVwedescribeourheuristicmethodologies,whichare
Services implementing network functions are called Virtual thentestedinanextensivesimulationcampaigninSectionVI.
Network Functions (VNFs). One of the open issues for NFVI We briefly conclude in Section VII.
design is indeed to guarantee high levels of VNF availability
[21], i.e., the probability that the network function is working II. HIGHAVAILABILITYVNFPLACEMENT
at a given time. In other words, a higher availability corre- We consider an NFVI with several geo-distributed data-
spondstoasmallerdowntimeofthesystem,anditisrequired centers or clusters (see Fig. 1). Each cluster consists of a
to satisfy stringent Service Level Agreements (SLA). Failures set of heterogeneous servers with limited available computing
may result in a temporary unavailability of the services, but resources. Several instances of the same VNF type can be
while in other contexts it may be tolerable, in NFVI network placed on the NFVI but on different servers. Each VNF
outages are not acceptable, since the failure of a single VNF instance can be assigned to a single server allocating some
of its computing resources. Indeed each server has a limited Cluster A Cluster B Cluster C
amount of computing resources that cannot be exceeded.
A network connects together all servers of the NFVI: S1 S3 S7 Slave
S5 S6 S8 S9
we suppose that the communication links inside a cluster Master
S2 S4
are significantly more reliable than those between servers in
different clusters. An access network with multiple access
AP1 AP2 AP3
points connects servers to clients. Links connecting access
R1 R2 R3
pointstoclusterscandifferintheavailabilitylevel,depending R4 R5 R6
on the type of the connection or the distance from the cluster.
Fig.2. AbstractrepresentationofVNFsplacementona3-clusterNFVI.Each
boxisaVNFinstance.InstancesrunningthesameVNFtypehavethesame
pattern.VNFswithagraybackgroundareslavesplacedasprotectionforthe
masters.SetsofrequestsareroutedfromaccesspointsandassignedtoVNFs
Cluster A Cluster B Cluster C runningtherequestedfunction.
S1 S3 S7
S5 S6 S8 S9
in [11], in such a way that the latter has always an updated
S2 S4
state and can consistently restore the computation in case
of failure of the former. We suppose that if a master is
unreachable, a searching process is started in order to find
AP1 AP2 AP3
a slave that can complete the computation. If the searching
Fig.1. AbstractrepresentationofanNFVI:eachserverisdepictedasawhite process fails and all slaves are unreachable, then the service
box whom height represents the amount of available resources. Clusters are
is consideredunavailable. Arepresentation of VNFprotection
connected together to allow synchronization operations. Access network is
representedbyaccesspointsconnectingclusterstotheexternalnetwork. is in Fig. 3.
Inthisarticle,weassumethatthetotalamountofresources S1 S2 S3 S4
and network capacities are sufficient to manage the expected
clientrequestsatanytime.Howeverassignmentdecisionsmay
artificially produce congestion over the servers. We analyze Fig.3. Exampleofprotection:masterVNFsarerunningonserversS2and
S4,whileslavearerunningonS1andS3.EachlinkbetweenVNFinstances
how to find assignments providing a trade-off between NFVI
representstheconnectionbetweenamasteranditsslave.
availability and system congestion.
We are given an estimation of the expected client VNF
requests,eachcharacterizedbyacomputingresourcedemand. III. RELATEDWORKS
An assigned request consumes part of the resources reserved Even if VM and VNF resource placement in cloud systems
by a VNF instance. Indeed the consumed resources must not is a recent area of research (see [16] for a high-level compre-
exceed the reserved ones. hensive study), however there already exists orchestrators that
Requests can be assigned using two different policies: are driven by optimization algorithms for the placement, such
demand load balancing and split demand load balancing. In as [23]. We now present few works in literature studying the
theformer,aclientrequestisalwaysfullyassignedtoasingle optimization problems that arises in this context.
server,whileinthelatteritmaybesplitamongdifferentones. a) PlacementofVirtualMachines: [20]studiestheprob-
Splitting a request also splits proportionally its demand of lem of placing VMs in datacenters minimizing the average
computing resources. Indeed, when a demand is split it relies latency of VM-to-VM communications. Such a problem is
on the availabilities of many VNF instances, decreasing the NP-hard and falls into the category of Quadratic Assignment
expected availability of the service, but increasing the chance Problems. The authors provide a polynomial time heuristic
of finding a feasible assignment in case of congestion. algorithmsolvingtheproblemina"divideetimpera"fashion.
We suppose a multi-failure environment in which VNFs, In [3] the authors deal with the problem of placing VMs in
servers, clusters, and networks may fail together. Our aim is geo-distributed clouds minimizing the inter-VM communica-
to improve the VNF availability by replicating instances on tiondelays.Theydecomposetheprobleminsubproblemsthat
the NFVI. We distinguish between master and slave VNFs: they solve heuristically. They also prove that, under certain
the former are active VNF instances, while the latter are idle conditions,oneofthesubproblemscanbesolvedtooptimality
until masters fail. An example of VNF placement is depicted in polynomial time. [6] studies the VM placement problem
in Fig. 2. minimizingthemaximumratioofthedemandandthecapacity
Each master may be protected by many slaves - we assume acrossallcutsinthenetwork,inordertoabsorbunpredictable
in this article that a slave can protect only a master, must be traffic burst. The authors provide two different heuristics to
placedonadifferentserver,andmustallocateatleastthesame solve the problem in reasonable computing time.
amount of computing resources of its master. b) Placement of Virtual Network Functions: [4] applies
Each master periodically saves and sends its state to its NFV to LTE mobile core gateways proposing the problem of
slaves, e.g. using technologies such as the one presented placing VNFs in datacenters satisfying all client requests and
latencyconstraintswhileminimizingtheoverallnetworkload. b) Virtual Network Functions: A set F of VNF types is
Instead, in [8] the objective function requires to minimize the given.EachVNFinstancerunsonasingleserver.Eachserver
total system cost, comprising the setup and link costs. [17] can host multiple VNF instances, but at most one master for
introduces the VNF orchestration problem of placing VNFs each type.
and routing client requests through a chain of VNFs. The c) Networks: An inter-cluster network allows synchro-
authors minimize the setup costs while satisfying all client nization between clusters, while an access network connects
demands. They propose both an ILP and a heuristic to solve clusters to a set of access points P. We are given sets L
C
suchproblem.Also[1]considerstheVNForchestrationprob- and L of logical links (c(cid:48),c(cid:48)(cid:48)) ∈ L connecting clusters
P C
lem with VNF switching piece-wise linear latency function c(cid:48),c(cid:48)(cid:48) ∈ C, and logical links (c,p) ∈ L connecting cluster
P
and bit-rate compression and decompression operations. Two c∈C to access point p∈P, respectively.
differentobjectivefunctionsarestudied:oneminimizingcosts d) Clients requests: A set of clients requests R is given.
and one balancing the network usage. Each request r ∈ R is a tuple (f ,P ,d ) of the requested
r r r
c) Placement with protection: In [5] VMs are placed VNFtypef ∈F,asubsetofavailableaccesspointsP ⊆P,
r r
with a protection guaranteeing k-resiliency, that is at least k and the resources demand d ∈R .
r +
slaves for each VM. The authors propose an integer formu- e) Availability: Taking into account explicit availability
lation that they solve by means of constraint programming. inNFVIdesignbecomesnecessarytoensureSLAs[12],[22].
In [15] the recovery problem of a cloud system is consid- We suppose that the availabilities of each component (server,
ered where slaves are usually turned off to reduce energy cluster,VNF,link)aregiven(seeTableI),eachcorresponding
consumption but can be turned on in advance to reduce the to the probability that a component is working.
recovery time. The authors propose a bicriteria approximation f) Objective function: All clients requests must be as-
algorithm and a greedy heuristic. In [27] the authors solve
signed to servers maximizing the availability of the system,
a problem where links connecting datacenters may fail, and
we measure as the minimum availability among all requests.
a star connection between VMs must be found minimizing
the probability of failure. The authors propose an exact and
TABLEI
a greedy algorithms to solve both small and large instances,
MATHEMATICALNOTATION.
respectively. Within disaster-resilient VM placement, [9] pro-
poses a protection scheme in which for each master a slave is C Setofclusters S Setofservers
selected on a different datacenter, enforcing also path protec- Sc Setofserversinclusterc F SetofVNFtypes
R Setofrequests P Setofaccesspoints
tion.In[2]theauthorssolvetheproblemofplacingslavesfor LC Setofsynchro.links LP Setofaccesslinks
a given set of master VMs without exceeding neither servers Pr Setofrequestaccess
points
nor link capacities. Their heuristic approaches decompose the
problems in two parts: the first allocating slaves, and the qs Capacityofservers dr Demandofrequestr
second defining protection relationships. cs Clusterofservers fr VNFofrequestr
Inarecentwork[25],theauthorsmodeltheVMavailability aFf AvailabilityofVNFf aSs Availabilityofservers
by means of a probabilistic approach and solve the placement aLccC(cid:48) Alinvkail(acb,icli(cid:48)t)yofsynchro. aLcpP A(cv,apil)abilityofaccesslink
problem over a set of servers by means of a nonlinear math-
aC Availabilityofclusterc
ematical formulation and greedy heuristics. This is the only c
work offering an estimation of the availability of the system.
However, it considers only the availability of the servers,
A. Computational complexity
while in our problem we address a more generic scenario:
when datacenters are geo-distributed, a client request shall be Concerning the assignment of requests, we can prove that:
assigned to the closest datacenter, since longer connections
Observation 1. When demand split is allowed and
may have a higher failure rate. Therefore, the source of the (cid:80) (cid:80)
d ≤ q , HA-VNFP has always a feasible so-
client requests may affect the placement of the VNFs on the r∈R r s∈S s
lution that can be found in polynomial time.
NFVI, and must be taken into account in the optimization
process and in the estimation of the availability. In fact since the requests can be split among servers, the
feasibility of an instance can be found applying a Next-Fit
IV. MODELING greedy algorithm for the Bin Packing Problem with Item
In the following we propose a formal definition to the HA- Fragmentation (BPPIF) [19]: servers can be seen as bins,
VNFP and a mathematical programming formulation. while requests as items that must be packed into bins. The
a) Clusters and servers: We are given the set of clusters algorithmiterativelypackitemstoanopenbin.Whenthereis
C and the set of servers S. Each server s belongs to a cluster not enough residual capacity, the item is split, the bin is filled
c , and we define as S ⊆ S the set of servers of cluster c. and closed, and a new bin is open packing the rest of the
s c
We represent the usual distinct types of computing resources item. When requests can be split, such algorithm produces a
(CPU, RAM, ... ) of server s∈S by the same global amount feasible solution for the HA-VNFP: if a request is assigned to
q ∈R of available resources. aserver,thenamasterVNFservingsucharequestisallocated
s +
on that server too. The Next-Fit algorithm runs in O(|R|) and Cluster A Cluster B Cluster C
therefore a feasible solution can be found in polynomial time.
S1 S3 S7
Observation 2. The feasibility of a HA-VNFP instance with- S5 S6 S8 S9
out demand split is a NP-hard problem. S2 S4
Indeed we can see again the feasibility problem as a Bin
AP1 AP2 AP3
Packing Problem (BPP). However, without split each item
R1
must be fully packed into a single bin. Therefore, finding a
Fig. 4. Example of assignment configuration γ =
feasiblesolutionisequivalenttothefeasibilityofaBPP,which
{(S2,{S1,S2,S7}),(S5,{S3,S5,S6})}, where request R1 is split
is NP-hard, and it directly follows that: and assigned to two different master VNFs on servers S2 and S5. Both
mastershaveslaves:themasterVNFonserverS2hasslavesonserversS1
Theorem1. TheHA-VNFPwithoutdemandsplitisNP-hard. andS7,whiletheoneonserverS5hasslavesonserversS3andS6.
That is, for unsplittable demands, it is NP-hard finding
a(r,ω) is the function computing the probability that at least
both a feasible solution and the optimum solution. It is less
one of the instances of ω is working:
straightforward to also prove that:
Theorem 2. The HA-VNFP with demand split is NP-hard. a(r,ω)=1−[ (1−aLP(cs,Pr)·aCcs ·aS(fr,Sp∩Scs)) ·
(cid:124) (cid:123)(cid:122) (cid:125)
availability of the cluster containing master
Proof: In fact, let us suppose a simple instance where
(cid:89)
all components (servers, clusters, links, ...) are equal ones and · (1−aLP(c,P )·aC ·aLC ·aS(f ,S ∩S )]
where (cid:80)r∈Rdr = (cid:80)s∈Sqs, which means that there will be c∈C\{cs}a(cid:124)vailability ofrclustcer(cid:123)c(cid:122)ocnstcaining ornlypslavec(cid:125)s
no slaves in our placement. The problem can be seen again
as a BPPIF in which the objective is to minimize the number When a request r is split, we compute its availability a(r,γ)
of splits of the item that is split the most: in fact, every time as the probability that all of its parts succeed: a(r,γ) =
(cid:81)
a request is split, the availability of the system decreases. In a(r,ω). We remark that such formula is nonlinear
(s,ω)∈γ
suchscenariosthebestsolutionistheoneinwhichnorequest and produces a Integer Nonlinear Programming formulation
is split at all - however, if we could solve such a problem which cannot be solved by common integer solvers like
in polynomial time, then we could solve also the feasibility CPLEX. Therefore we propose a MIP linearization of such
problem of a BPP in polynomial time, which instead is NP- nonlinear formulation in which for each assignment config-
hard. Therefore, since we can reduce a feasibility problem of uration γ ∈ Γ we have a binary variable stating if such
BPP to an instance of BPPIF, and the latter to an instance of configuration is selected in the solution.
HA-VNFP, the HA-VNFP with split is NP-hard. h) Variables: The following variables are needed:
x : fraction of request r assigned to server s
rs
(cid:40)
B. Mathematical formulation 1, if γ is active for request r
z =
rγ
0, otherwise
In the following we propose a mathematical programming
u : resources consumed by VNF f on s
formulation of HA-VNFP starting from the definition of the fs
set of the solutions: a request assignment ω is a pair (s,Sp) vfss(cid:48) : resources consumed by slave on server s(cid:48)
indicating the subset of servers S ⊆ S running either the A : minimum availability
p min
master or the slaves of a VNF instance, and the server s∈S
p i) Model: HA-VNFP can be modeled as follows:
where the master is placed. We also define Ω = {(s,S ) |
p
S ⊆ S,s ∈ S } as the set of all request assignments. An max A (1)
p p min
assignment configuration γ (see Fig. 4) is a set of all request (cid:88)
s.t. xrs =1 ∀r∈R (2)
assignmentsω forallthefragmentsofarequest.Wedefineas
s∈S
Γ the set of all assignment configurations γ, that is Γ={γ ∈ (cid:88)
2Ω |s(cid:48) (cid:54)=s(cid:48)(cid:48),∀(s(cid:48),ω(cid:48)),(s(cid:48)(cid:48),ω(cid:48)(cid:48))∈γ}. xrs ≤ zrγ ∀r∈R,s∈S (3)
γ∈Γ
g) Availability computation: We compute the NFVI ∃(s,ω)∈γ
availability for a request r by means of a probabilistic ap- (cid:88)
dr·xrs ≤ufs ∀f∈F,s∈S (4)
proach [7], [14]. Given a cluster and a set of access points,
r∈R
aLP(c,P) is the function computing the probability that at fr=f
(le(cid:81)ast on1e−ofaLthPe).aGcciveesns alinVkNsFisanwdoraksinegt:ofasLePrv(ce,rsP,)aS=(f1,S−) ufrs+qs·(cid:88)zrγ ≤vfrss(cid:48) +qs ∀r∈R,s,s(cid:48)∈S (5)
p∈P cp γ∈Γ
is the function computing the probability that at least one in- ∃(s,ω)∈γ|s(cid:48)∈ω
stanceofVNFisworking:aS(f,S)=1−((cid:81) 1−aF·aS). (cid:88) (cid:88)
s∈S f s ufs+ vfs(cid:48)s ≤qs ∀s∈S (6)
Given a request r and a request assignment ω = (s,S ),
p f∈F s(cid:48)∈S
(cid:88)
zrγ ≤1 ∀r∈R (7) havingenoughcapacityforaslavestillusing SELECTSERVER
γ∈Γ procedure. After a server is found, the slave is placed. Such a
(cid:88) procedure is repeated until no additional slave is placed.
a(r,γ)·zrγ ≥Amin ∀r∈R (8)
γ∈Γ
Algorithm 1 Greedy heuristic procedure
Constraints (2) and (3) ensure that each request is fully 1: function GREEDY(R,S,split)
assignedandselectsanassignmentconfiguration,respectively. 2: placement←∅
Constraints (4) and (5) set the allocated resources of masters 3: for all r∈R|dr >0 do (cid:46) Assignment of requests
and slaves, respectively. Constraints (6) ensure that servers 4: if ∃sˆ←SELECTSERVER(dr,S,split) then
capacities are not exceeded. Constraints (7) impose that at 5: create VNF fr if it does not exists in placement
6: assign request r to server sˆin placement
most one assignment configuration is selected for each re- 7: demand dr is decreased
quest. Constraints (8) compute the minimum availability. Our 8: else
formulation can model both the HA-VNFP with and without 9: return infeasible
split: in fact by simply setting |γ|=1 for each configuration 10: end if
11: end for
γ we forbid configurations splitting a request.
12: do (cid:46) Add slaves
13: for all VNFs v∈placement do
V. HEURISTICS 14: S¯← servers of S without v and its slaves
Solving HA-VNFP as a MIP using an integer solver works 15: if ∃sˆ←SELECTSERVER(dv,S¯,FALSE) then
only for small NFVI, since the number of variables is expo- 16: create slave of VNF v on server sˆin placement
17: end if
nential w.r.t the size of the instances. Therefore we propose
18: end for
two different heuristic approaches for HA-VNFP: the first 19: while slaves are found
is an adaptation of well-known greedy policies for the BPP 20: return placement
that will serve as comparison, while the second is a Variable 21: end function
Neighborhood Search heuristic using different algorithmic
operators to explore the neighborhood of a starting point.
B. Variable Neighborhood Search
A. Greedy heuristics The Variable Neighborhood Search (VNS) is a meta-
heuristic that systematically changes the neighborhood within
Most of the heuristics for the placement of VMs or VNFs
the local search algorithm, in order to escape from local
are based on a greedy approach, and BPP heuristics are often
optima. In other words, it starts from an initial solution,
exploited to obtain suitable algorithms for the placement,
applies a local search algorithm until it improves, and then
such as in [24]. We also exploit BPP heuristics to obtain
changes the type of local search algorithm applied to change
three different greedy approaches for the HA-VNFP: Best
the neighborhood. Our VNS algorithm explores 4 different
Availability, Best Fit, and First Fit greedy heuristics. The
neighborhoodsanditisinitializedwithseveralstartingpoints,
algorithm,reportedinAlgorithm1,startsfromanemptyinitial
each obtained using a different greedy algorithm.
placement and for each request r it looks for a server having
enough residual capacity to satisfy the demand d . If such a
r Algorithm 2 Variable Neighborhood Search
serverisfound,thentherequestisassignedtoit,otherwisethe
1: function VNS(R,S,split)
algorithm fails without finding a feasible solution. However,
2: startingPoints ← { BESTAVAILABILITY(R,S,split),
we can observe that: BESTFIT(R,S,split), FIRSTFIT(R,S,split) }
(cid:80) (cid:80) 3: operators←{vnfSwap,slaveSwap,requestSwap,
Observation 3. When d ≤ q and split is
r∈R r s‘inS s requestMove}
allowed, our greedy heuristic always finds a feasible solution. 4: bestPlacement←∅
5: for all placement∈startingPoints do
In fact we can always split a request between two servers,
6: do
as stated also in Observation 1. 7: for all op∈operators do
The selection of the server is performed by the procedure 8: placement← apply op to placement
SELECTSERVER(S¯,d¯,split) which discards the servers with- 9: if placement improves bestPlacement then
out sufficient resources to satisfy demand d¯, and selects a 10: bestPlacement←placement
11: break
server depending on the chosen policy:
12: end if
• best fit: the server whose capacity best fits the demand; 13: end for
• first fit: the first server found; 14: while improving bestPlacement
15: end for
• best availability: the server with the highest availability.
16: return bestPlacement
While the first three policies are well-know for the BPP, the 17: end function
fourth one is designed for the HA-VNFP.
Master VNFs are placed during the assignment of the The main logic of our VNS algorithm is sketched in
requests. Then, in a similar way, the algorithm places addi- Algorithm2:wegenerate3startingpointsbyusingthegreedy
tional slaves: for each master the algorithm looks for a server heuristics of Section V-A and we explore their neighborhood
for a placement improving the availability. If no improvement S1 S2 S3 S4 S1 S2 S3 S4
can be found, the algorithm switches the neighborhood.
Indeed applying local search is time expensive, but we can
observethatamax-minobjectivefunctiondividestherequests R1 R2 R3 R4 R1 R2 R3 R4
intwosets:asetofrequestshavinganavailabilityequaltothe
Fig. 7. Example of requests swap neighborhood: requests R2 and R3 are
objective function and another set having a better availability. swapped changing the respective servers. When swapping a request, a new
We refer to the former as the set of the worst requests, since VNFinstanceiscreatedifnoneexistedonthenewserver.
they are the ones whose improvement will also improve the
S1 S2 S3 S4 S1 S2 S3 S4
availability of the entire solution. To reduce the computing
time and focus our algorithm we found to be profitable to
restrict the explored neighborhood to the worst requests only.
R1 R2 R3 R4 R1 R2 R3 R4
Also, after applying each operator we look for new slaves, as
in the greedy procedure Algorithm 1. Fig. 8. Example of request move neighborhood: request R2 is moved and
assignedtoadifferentserver.
Giventwofeasibleplacements,wesaythatoneisimproving
if it has a higher availability or if it has the same availability
but fewer worst requests.
number of iterations can be set, obtaining a O(k·|R|·|S|·
InthefollowingwedescribetheneighborhoodsofourVNS.
max{|R|,|F|·|S|}) heuristic. Also, in the following we show
a) VNFsswap: Thefirstneighborhoodconsistsofswap-
that our VNS requires small computing time for NFVI of
pingVNFs(seeFig.5):givenaVNF,weswapitwithasubset
limited size and it can be parameterized to end within a time
of VNFs deployed on a different server. If the placement is
limit, making it suitable for both online and offline planning.
improved, then we store the result as the best local change.
In general our operator is O(2|F|·|S|) but we found prof- VI. SIMULATION
itable to set an upper bound of 1 to the cardinality of the set
We evaluate empirically the quality of our methodologies:
of swapped VNFs, obtaining a O(|F|·|S|) operator.
the greedy heuristic using four different policies (best fit,
first fit, and best availability), the VNS algorithm, and the
S1 S2 S3 S4 S1 S2 S3 S4
mathematical programming formulation as a MIP. However
we could run our MIP only on small instances with 3 or
4 servers and 50 requests. In our framework we first run
R1 R2 R3 R4 R1 R2 R3 R4
the algorithms using the demand load balancing policy, and
Fig.5. ExampleofVNFsswapneighborhood:VNFsonserverS2andS3 allowingsplitonlyiftheformerfailstoassignalltherequests.
areswapped.IfaVNFisamaster,thenallitsassignedrequestsareredirected
All methodologies are implemented in C++, while CPLEX
tothenewserver.
12.6.3 [10] is used to solve the MIP. The simulations have
b) Slave VNFs swap: We explore the neighborhood been conducted on Intel Xeon X5690 at 3.47 GHz. We
where a slave VNF is removed to free resources for an also produced a graphical DSS tool integrating the VNS and
additional slave of a different master VNF (see Fig. 6). The the greed algorithms (in python) working on arbitrary 2-hop
complexity of this operator is O(|F|·|S|). topologies and made it available in [28].
1) Dataset generation: We generated a random dataset
S1 S2 S3 S4 S1 S2 S3 S4 consisting of instances that differ for the number of requests,
total amount of computing resources, and availabilities of
the network components. We set the number of VNF types
R1 R2 R3 R4 R1 R2 R3 R4
provided by our NFVI to |F| = 5. We assumed an NFVI
Fig.6. ExampleofslaveVNFsswapneighborhood:aslaveisremovedfrom with 3 clusters (|C| = 3) and 3 access points (|P| = 3).
theplacementinordertofreeresourcesforaslaveofadifferentmasterVNF. Each request has a random demand d ∈ [1,10], while each
r
server has a random capacity q ∈[75,125]. The availabilities
c) Requests swap: We also explore the neighborhood s
of all the components of our NFVI are selected between
where requests are swapped (see Fig. 7): given a request we
{0.9995,0.9999,0.99995,0.99999} as in [13], [25], [26].
considerasubsetofrequestsassignedtoadifferentserverand
We generated 30 instances for each combination of:
then swap the former with the latter. Similarly to the swap of
VNFs,thecomplexityofthisoperatorisO(2|R|).However,by • number of requests |R|={50,100,200,300,400,500};
setting an upper bound of 1 to the cardinality of the swapped • numberofaccesspointsforeachrequest|Pr|∈{1,2,3}.
requests set we obtain a O(|R|) operator. The number of servers depends on the number of requests,
d) Requestmove: Inthelastexplorationweconsiderthe the total amount of the demands, and the random capacities:
neighborswherearequestissimplymovedtoadifferentserver we generated a set of servers such that their capacities are
(cid:80) (cid:80)
(see Fig. 8). The complexity of this operator is O(|S|). enough to serve all the demands Q = q ≥ d .
s∈S s r∈R r
In principle, even if all the operators polynomial time our Note that such condition guarantees the feasibility only when
VNS algorithm is not . However, an upper bound k to the splitting requests is allowed. Servers are randomly distributed
best availability best fit first fit VNS MIP 1.000000
1e+04
0.999891
1e+03
0.999782
e (s)1e+02 ability0.999673
g tim1e+01 avail0.999564
ge computin11ee−+0010 e minimum 00..999999344565
a g
Aver1e−02 Avera0.999237
1e−03 0.999128 best availability
best fit
0.999020 first fit
1e−04 VNS
MIP
0.998911
1e−05
1 Q 1.25 Q 1.5 Q 1.75 Q 2 Q 2.25 Q 2.5 Q 2.75 Q 3 Q 1 Q 1.5 Q 2 Q 2.5 Q 3 Q
Server computing resources Server computing resources
(a) Averagecomputingtime(verticalaxisinlog.scale). (b) Averageminimumavailability.
Fig.9. Resultsforinstanceswith50requests.
among all the clusters, in such a way that for each pair Fig. 10(a), Fig. 10(b), and Fig. 10(c) we report the average
of clusters c and c(cid:48) we have |S − S | ≤ 1. Under these availability when requests can be routed to the NFVI using
c c(cid:48)
conditions we obtained instances with around 3 servers when 1, 2, and 3 access points, respectively. The path protection
|R|=50 and 28 servers when |R|=500. obtained by using more than one access point substantially
increases the level of the availability. However, having more
A. Comparison on small instances
than 2 access points does not provide additional benefits.
WefirstevaluatethequalityofourVNSheuristicagainstthe
B. Scaling up the number of requests
solutionsobtainedbytheMIPsolverandthegreedyheuristics.
In order to study how the NFVI behaves on different levels In a second round of experiments, we evaluated how our
of congestion, we let its computing resources to grow from VNS algorithm behave when scaling up the number of re-
(cid:80)
an initial value of Q = q , to 3 · Q, with a step of quests. However, when the number of servers increases its
s∈S s
0.25·Q.Duetotheexponentialnumberofthevariablesofour is not possible to use the MIP. Therefore, in the following
formulation,theMIPsolvercouldhandleonlysmallinstances analysis we compare the results of our VNS algorithm to the
with 50 requests. All tests have been performed setting a time greedyonesonly.Inaddition,sinceourVNSalgorithmhasnot
limit of two hours each and all the algorithms managed to polynomialtimecomplexity,weincludeinthecomparisonthe
assign all requests without splits. resultsobtainedbysettingatimelimitof10sattheexploration
In Fig. 9(a) we show the average computing time of the of each starting solution.
algorithms: while the MIP hits several times the time limit, We can first observe in Fig. 11(a) that the computing time
computingtimesarenegligibleforalltheheuristics,andVNS of the VNS algorithm grows exponentially when the number
canbeconsideredasagoodalternativeforonlineorchestration of requests increases. Indeed setting a time limit reduces the
whenthesetoftherequestsissmall.Theoptimizationproblem overall computing time, which is always less than a minute.
seems harder when the amount of computing resources is In Fig. 11(b) we show that on average there is always
scarce:infact,theaveragecomputingtimeoftheMIPiscloser a substantial gap between the results obtained by our VNS
to the time limit when the overall capacity is less than 2·Q. algorithmsandthegreedyheuristics.Wecanalsoobservethat
Instead, with higher quantities of resources the MIP always on average the VNS time limit does not penalize substantially
find the optimal solution within the time limit. the results. Therefore our VNS algorithm with time limit can
In Fig. 9(b) we show that the results of our VNS heuristic reduce computing times with minimal loss in availability.
are close to the MIP ones, while there is a significant gap From Fig. 12(a) to Fig. 12(e) we show the results on
between the latters and the greedy heuristic ones. In fact, instances having from 100 to 500 requests individually. We
both the MIP and the VNS succeed in finding solutions with can observe that the greedy heuristics progressively loose in
an availability of three nines even with scarce resources. quality of the solutions, and the gap with the VNS algorithms
Eventuallyallthealgorithmsreachanhighlevelofavailability increases with the number of the requests.
when computing resources are doubled. FinallyinFig.12(f)weshowonlytheresultsconcerningthe
In Fig. 10 we show the variation of the availability when VNS algorithm without time limit and how it behaves when
the number of access points for each request increases: in both the number of requests and the overall capacity increase.
1.000000 1.000000 1.000000
0.999869 0.999902 0.999902
0.999737 0.999804 0.999804
m availability00..999999467046 m availability00..999999670097 m availability00..999999670097
mu0.999343 mu0.999511 mu0.999511
Average mini00..999999028102 Average mini00..999999341153 Average mini00..999999341153
0.998949 best availability 0.999218 best availability 0.999218 best availability
best fit best fit best fit
0.998817 first fit 0.999120 first fit 0.999120 first fit
VNS VNS VNS
MIP MIP MIP
0.998686 0.999022 0.999022
1 Q 1.5 Q 2 Q 2.5 Q 3 Q 1 Q 1.5 Q 2 Q 2.5 Q 3 Q 1 Q 1.5 Q 2 Q 2.5 Q 3 Q
Server computing resources Server computing resources Server computing resources
(a) 1accesspointforrequest. (b) 2accesspointsforrequest. (c) 3accesspointsforrequest.
Fig.10. Averageminimumavailabilitiesforinstanceswith50requestsanddifferentnumberofaccesspointsforeachrequest.
best availability best fit first fit VNS VNS 10s 1.000000
1e+03
0.999870
1e+02 0.999741
y
e (s)1e+01 abilit0.999611
ge computing tim11ee−+0010 e minimum avail000...999999999234258322
a g
Aver1e−02 Avera0.999093 bbeesstt afitvailability
1e−03 0.998964 first fit
VNS
1e−04 0.998834 VNS 10s
0.998705
1e−05
|R|= 100 |R|= 200 |R|= 300 |R|= 400 |R|= 500 1 Q 1.5 Q 2 Q 2.5 Q 3 Q
Number of requests Server computing resources
(a) Averagecomputingtime(verticalaxisinlog.scale). (b) Averageminimumavailability.
Fig.11. Resultsforinstanceswithupto500requests.
We can observe that the curves are similar and the quality of in computing times smaller of orders of magnitude. We high-
the solutions provided by our algorithm is not affected by the lighted the substantial gap between the availability obtained
increasing of the size of the instances. Our VNS algorithm using classic greedy policies, and the one obtained with a
alwaysprovidesplacementswithanavailabilityofthreenines more advanced VNS algorithm, when the NFVI is congested.
even when resources are scarce, and it always reaches four OurVNSalgorithmshowedtobeagoodcompromisetosolve
nines when the capacity is doubled. HA-VNFPinreasonablecomputingtime,provingtobeagood
alternative for both online and offline planning.
VII. CONCLUSION
We integrated our HA-VNFP algorithms in a graphical
WedefinedandmodeledtheHA-VNFP,thatistheproblem simulator made available with tutorial videos in [28].
of placing VNFs in NFVI guaranteeing high availability.
We provided a quantitative model based on probabilistic ACKNOWLEDGMENTS
approachestoofferanestimationoftheavailabilityofaNFVI. This article is based upon work from COST Action
We proved that the arising nonlinear optimization problem is CA15127 ("Resilient communication services protecting end-
NP-hardandwemodelleditbymeansofalinearformulation user applications from disaster-based failures - RECODIS")
with an exponential number of variables. However, to solve supported by COST (European Cooperation in Science and
instances with a realistic size we designed both efficient and Technology). This work was funded by the ANR Reflex-
effective heuristics. By extensive simulations, we show that ion (contract nb: ANR-14-CE28-0019) and the FED4PMR
our VNS algorithm finds solution close to the MIP ones, but projects.
1.00000 1.000000 1.000000
0.99988 0.999870 0.999863
0.99976 0.999740 0.999726
m availability 00..9999995624 m availability00..999999468100 m availability00..999999455930
mu 0.99940 mu0.999350 mu0.999316
Average mini 00..9999991268 Average mini00..999999028199 Average mini00..999999014739
0.99904 best availability 0.998959 best availability 0.998906 best availability
best fit best fit best fit
0.99892 first fit 0.998829 first fit 0.998769 first fit
VNS VNS VNS
VNS 10s VNS 10s VNS 10s
0.99880 0.998699 0.998632
1 Q 1.5 Q 2 Q 2.5 Q 3 Q 1 Q 1.5 Q 2 Q 2.5 Q 3 Q 1 Q 1.5 Q 2 Q 2.5 Q 3 Q
Server computing resources Server computing resources Server computing resources
(a) Instanceswith100requests. (b) Instanceswith200requests. (c) Instanceswith300requests.
1.00000 1.000000 1.000000
0.99987 0.999868 0.999910
0.99974 0.999736 0.999819
m availability 00..9999994681 m availability00..999999467013 m availability00..999999673289
mu 0.99935 mu0.999339 mu0.999548
Average mini 00..9999990292 Average mini00..999999027057 Average mini00..999999346577 5100 0r erqeuqeusetssts
0.99896 best availability 0.998943 best availability 0.999277 200 requests
best fit best fit 300 requests
0.99883 first fit 0.998810 first fit 0.999186 400 requests
VNS VNS 500 requests
VNS 10s VNS 10s
0.99870 0.998678 0.999096
1 Q 1.5 Q 2 Q 2.5 Q 3 Q 1 Q 1.5 Q 2 Q 2.5 Q 3 Q 1 Q 1.5 Q 2 Q 2.5 Q 3 Q
Server computing resources Server computing resources Server computing resources
(d) Instanceswith400requests. (e) Instanceswith500requests. (f) VNSresultsonly.
Fig.12. Averageminimumavailabilitiesforinstanceswithupto500requests.
REFERENCES [14] IEC. Analysistechniquesfordependability–Reliabilityblockdiagram
method,1991.
[1] B. Addis, D. Belabed, M. Bouet, S. Secci. Virtual Network Functions [15] A.Israel,D.Raz. Costawarefaultrecoveryinclouds. IEEE/IFIPIM
PlacementandRoutingOptimization. IEEECLOUDNET2015. 2013.lobalServerHardware,ServerOSReliabilityReport,2014.
[2] H.A.Alameddine,S.Ayoubi,C.Assi. Protectionplandesignforcloud [16] B. Jennings, R. Stadler. Resource management in clouds: Survey and
tenantswithbandwidthguarantees. DRCN2016. researchchallenges. J.NetworkandSystemsManagement23(3),2014.
[3] M. Alicherry, T. V. Lakshman. Network aware resource allocation in [17] M.C.Luizelliet.al. PiecingtogethertheNFVprovisioningpuzzle:Ef-
distributedclouds. IEEEINFOCOM2012. ficientplacementandchainingofvirtualnetworkfunctions. IEEE/IFIP
[4] A.Bastaetal. ApplyingNFVandSDNtoLTEmobilecoregateways, IM2015.
the functions placement problem. Proc. of the 4th Workshop on All [18] J.Martinsetal. Clickosandtheartofnetworkfunctionvirtualization.
ThingsCellular:Operations,Applications,Challenges,2014. USENIXNSDI2014.
[5] E. Bin et al. Guaranteeing high availability goals for virtual machine [19] N. Menakerman, R. Rom. Bin Packing with Item Fragmentation.
placement. ICDCS2011. AlgorithmsandDataStructures:7thInternationalWorkshop,2001.
[6] O.Biranetal.Astablenetwork-awarevmplacementforcloudsystems. [20] X. Meng, V. Pappas, L. Zhang. Improving the scalability of data
IEEE/ACMCCGrid2012. center networks with traffic-aware virtual machine placement. IEEE
[7] A.Birolini. ReliabilityandAvailabilityofRepairableSystems. Relia- INFOCOM2010.
bilityEngineering:TheoryandPractice,2013. [21] R.Mijumbietal. Networkfunctionvirtualization:State-of-the-artand
[8] R.Cohen,L.Lewin-Eytan,J.S.Naor,D.Raz. Nearoptimalplacement researchchallenges. IEEECommunicationsSurveysTutorials,2016.
ofvirtualnetworkfunctions. IEEEINFOCOM2015. [22] J. Mo, B. Khasnabish. NFV Reliability using COTS Hardware, draft-
[9] R. S. Couto, S. Secci, M.E.M. Campista, L.H. M.K. Costa. Server mlk-nfvrg-nfv-reliability-using-cots-01,2015.
placement with shared backups for disaster-resilient clouds. Computer [23] J. F. Riera et al. TeNOR: Steps towards an orchestration platform for
Networks,2015. multi-PoPNFVdeployment. IEEENetSoft2016.
[10] CPLEX development team. IBM ILOG CPLEX optimization studio [24] P. Silva, C. Perez, F. Desprez. Efficient Heuristics for Placing Large-
cplexuser’smanual-version12release6,2015. ScaleDistributedApplicationsonMultipleClouds.IEEE/ACMCCGRID
[11] B. Cully et al. Remus: High availability via asynchronous virtual 2016.
machinereplication. USENIXNSDI2008. [25] S.Yang,P.Wieder,R.Yahyapour. ReliableVirtualMachineplacement
[12] ETSI.ETSIGSNFV-REL003V1.1.1NetworkFunctionsVirtualisation indistributedclouds. RNDM2016.
(NFV); Reliability; Report on Models and Features for End-to-End [26] Q.Zhang,M.F.Zhani,M.Jabri,R.Boutaba. Venice:Reliablevirtual
Reliability,2016. datacenterembeddinginclouds. IEEEINFOCOM2014.
[13] P. Gill, N. Jain, N. Nagappan. Understanding network failures in data [27] Y. Zhu et al. Reliable resource allocation for optically interconnected
centers: measurement, analysis, and implications. ACM SIGCOMM distributedclouds. IEEEICC2014.
2011. [28] HA-NFVsimulatorsoftware(online):https://ha-nfv.lip6.fr.