Ensemble Methods of Classification for Power Systems Security Assessment

Alexei Zhukov, Victor Kurbatsky, Nikita Tomin, Denis Sidorov, Daniil Panasetsky
Energy Systems Institute, Russian Academy of Sciences, Irkutsk, Russia
{kurbatsky, dsidorov, tomin}@isem.sei.irk.ru

Aoife Foley
School of Mechanical and Aerospace Engineering, Queen's University Belfast, Belfast, UK
[email protected]

arXiv:1601.01675v1 [cs.AI] 7 Jan 2016

Abstract—One of the most promising approaches to the analysis of complex technical systems employs ensemble methods of classification. Ensemble methods make it possible to build reliable decision rules for feature space classification in the presence of many possible states of the system. In this paper, novel techniques based on decision trees are used to evaluate the reliability of the operating regime of electric power systems. We propose a hybrid approach based on random forest models and boosting models. Such techniques can be applied to predict the interaction of increasing renewable power, storage devices and switching of smart loads from intelligent domestic appliances, storage heaters, air-conditioning units and electric vehicles with the grid for enhanced decision making. The ensemble classification methods were tested on the modified 118-bus IEEE power system, showing that the proposed technique can be employed to examine whether the power system is secure under steady-state operating conditions.

Index Terms—power system; ensemble methods; boosting; classification; heuristics; random forests; security assessment.

This work is funded by the RSF project "Development of an intelligent system for preventing large-scale emergencies in power systems" under grant No. 14-19-00054.

I. INTRODUCTION

Assessment of the security of bulk electric power systems is one of the pressing problems in modern power engineering. The trends towards liberalization and the need to increase electricity transmission due to growing loads and generation expansion make existing power companies operate their electrical networks in critical conditions, close to their admissible security limits [1]. In such conditions, unforeseen excess disturbances, weak connections, hidden defects of the relay protection system and automated devices, human factors and a great number of other factors can cause a drop in system security or even the development of catastrophic accidents.

At present there is a wide spectrum of approaches and tools for the assessment of security. The methods can be divided into:
• Traditional approaches based on detailed modeling of potential disturbances in electric power systems and complex numerical calculations of nonlinear capacity equations [3, 4];
• Intelligent approaches which involve artificial intelligence algorithms learning on a limited set of power system states, such as artificial neural networks, support vector machines, decision trees, etc. [1, 2, 5, 19].

An analysis of methods for the assessment of security and voltage stability of electric power systems shows that the existing traditional approaches cannot be applied effectively in online and real-time conditions because of their computational complexity. For example, load flow calculation for assessing the aftermath of a system component fault, which underlies the classical approaches to security assessment in electric power systems, does not seem to be fully implemented because of the complex modeling of the corresponding protections that trip the overloaded line or load feeder in case of an inadmissible voltage level. Moreover, distributed generation is currently being connected as a local energy source to meet the demand for electricity and to ensure the quality and reliability of electricity supply.

Although many of the approaches developed on the basis of intelligent models are better adapted to real changes in the power system topology and system conditions, they still do not have sufficient sensitivity to these changes, do not always offer the possibility of predicting voltage stability losses, and do not allow us to estimate the probability of identifying a potentially dangerous state.

At the transmission level (i.e. bulk power), phasor measurement units (PMUs) have been introduced to improve grid reliability. A PMU provides real-time phasor measurements synchronised to an absolute time reference provided by the Global Positioning System (GPS). PMUs are used to assess grid conditions (e.g. MVARs, kV, frequency changes etc.) because they synchronise power quality measurements in real time by comparing phase angle measurements. To date PMUs are only deployed for Wide Area Measurement (WAM) [20, 21]. One of the issues is applying and using the large amounts of PMU data for rapid decision making. Decision making, and the onus for it, usually still rests with the expertise of the grid operators. However, as the number of market participants, renewable power sources, storage devices and smart loads increases in the power system at both the transmission and distribution levels, decision making will become ever more complex. Hence this research employs ensemble methods based on decision trees. The calculations involved the following modifications of random forest models (Extremely Randomized Trees, Oblique Random Forests) and boosting models (Stochastic Gradient Boosting, AdaBoost). The effectiveness of their application is confirmed by a large number of calculations on a power system test scheme. The suggested ensemble methods of classification are implemented in R, a free and open-source software environment for computation.

II. PROBLEM STATEMENT

Security is the ability of an electric power system to withstand sudden disturbances without unforeseen effects on electricity consumers. It is provided by the control capabilities of power systems. In operational practice the required level of security can be achieved by both preventive control actions (before a disturbance) and emergency control actions (after a disturbance). Control in pre-emergency conditions is mainly the responsibility of the operational dispatching control. At the same time there can be situations where the speed of power system control by the dispatching personnel is insufficient to avoid dangerous situations. The complexity of the problem lies in the fact that most of the dangerous (pre-emergency) states of an electric power system which lead to large-scale blackouts are unique, and there is no single algorithm that reveals such conditions in time. The problem is complicated by the fact that the security limit of an electric power system constantly changes; therefore fast methods for real-time security monitoring are required to analyze the current level of security, accurately trace the limit and detect the most vulnerable regions along it.

The key idea of the pre-emergency control concept is that the voltage instability following an emergency disturbance, which accompanies many system emergencies, does not develop as fast as the dynamic instability of the electric power system [6]. Thus, during the phase of slow emergency development, the balance between generation and consumption is maintained for a long time, and this makes it possible to detect potentially dangerous states which appear after the disturbance and to generate the respective preventive control actions [2].

III. ENSEMBLE ALGORITHMS OF CLASSIFICATION AND THE PROBLEM OF POWER SECURITY ASSESSMENT

A great many studies show that an effective solution to this problem can be found on the basis of machine learning methods, which normally include artificial neural networks, decision trees, ensemble (committee) models, etc. This is related to their capabilities of fast detection of images and patterns (i.e. typical samples), learning/generalization and, importantly, high speed of identifying the instability boundaries.

One of the advanced approaches to the analysis of complex technical systems is ensemble methods of classification. They make it possible to form reliable decision rules of classification for a set of potential system states. The key idea in this approach is to build a universal classifier of power system states which is capable of tracing dangerous pre-emergency conditions and predicting emergency situations on the basis of certain system security indices. In this case the detection of dangerous operation patterns is not effective without considering probable disturbances/faults, whose calculation leads to a considerable increase in computational complexity and a potential decrease in accuracy for basic algorithms. This leads to the need to improve the accuracy of the classifier of power system states. One such method is the creation and training of ensembles of classification models.

One of the first and most general theories of algorithmic ensembles was proposed in the algebraic approach by Yu. I. Zhuravlev [7]. According to this theory the composition of $N$ basic algorithms $h_t = C(a_t(x))$, $t = 1, \ldots, N$, is taken to mean a superposition of algorithmic operators $a_t : X \to R$, a correction operation $F : R^N \to R$ and a decision rule $C : R \to Y$ such that $H(x) = C(F(a_1(x), \ldots, a_N(x)))$, where $x \in X$, $X$ is a space of objects, $Y$ is a set of answers, and $R$ is a space of estimates.

Later Valiant and Kearns [8] were the first to pose the question of whether a weak learning algorithm can be strengthened into an arbitrarily accurate learning algorithm. This process was called boosting. Schapire [9] developed the first provable polynomial-time boosting algorithm. It was intended for the conversion of weak models into strong ones by constructing an ensemble of classifiers. The main idea of the boosting algorithm is the step-by-step enhancement of the algorithm ensemble. One of the popular implementations of this idea is the AdaBoost algorithm of Freund and Schapire, which involves an ensemble of decision trees [10].

Another approach to classification and regression problems using ensembles was suggested by Leo Breiman [11]. This approach is an extension of the bagging idea. According to this idea, a collective decision can be obtained by using an elementary committee method which classifies an object according to the decision of the majority of the algorithms. Unlike boosting, bagging is based on parallel learning of the base classifiers. One of the progressive bagging-based approaches is the method called Random Forest [11, 12]. Later, more effective modifications of both Random Forest and boosting algorithms appeared, such as Extremely Randomized Trees, Oblique Random Forests and Stochastic Gradient Boosting [13].

In the research devoted to security assessment there are many approaches oriented to the construction of models on the basis of decision trees [1, 2, 5, 13]. These models use both off-line (periodically updated) and on-line methods. Single trees are easily interpretable, yet do not always provide the required accuracy when approximating complex target relationships. Therefore, it is considered reasonable to use compositions [14].

IV. CALCULATION OF A POWER SYSTEM SECURITY ON THE BASIS OF ENSEMBLE MODELS

Figure 1 presents a general scheme of the suggested approach for the estimation of power system security. The primary principle of the approach is to train a mathematical model, using the ensemble method of classification, to automatically make a sufficiently accurate assessment of the power system conditions according to the criterion secure/insecure on the basis of significant classification attributes of a power system state, for example active and reactive power flows, bus voltages, etc. A large number of such attributes are obtained from a randomly generated data sample consisting of a set of realistically possible states of the electric power system [2].

[Figure 1. A general scheme of the assessment of potential power system security, using compositional models. Panels: modeling calculated power system states (MATLAB/PSAT environment); ensemble classification method (software environment R), in which training samples 1 to N feed classifiers 1 to N and their composition yields the power system security index.]

Depending on the ensemble method applied, each decision rule is trained on its own subsample according to the bagging and boosting principles. The final decision on the classification of any power system state is made within the generalized classifier according to one of several principles: simple majority voting, weighted voting, or choosing the most competent decision rule.

In the paper the list of potential power system states for model learning is formed using quasi-dynamic modeling with a special program in the MATLAB environment (Fig. 2). The load model was represented by static characteristics depending on voltage. When critical values of voltage are reached, the load is automatically transferred to shunts. The method of a proportional increase in load at all nodes of the test scheme was optimized for the security analysis in such a way that the initial condition for each emergency disturbance is the closest stable condition among those calculated. Thus, at each stage of an increase in the test scheme load, the emergency events (primary disturbances) are randomly modeled by the N−1 reliability principle. As a result, a database including a set of various pre-emergency and emergency states of the test scheme is built.

[Figure 2. A scheme of modeling calculated states of electric power system in the problem of security monitoring and assessment on the basis of quasi-dynamic modeling in the MATLAB/PSAT environment. Blocks: system loading procedure (topology, load/generation models); validation (load flow computation, secondary voltage control, HV shunt compensation); data bases (normal and heavy load operating states; pre-disturbance attributes and just-after-disturbance attributes: voltages, power flows, loads; security classes); disturbances simulation (apply a random disturbance at t = 0, time domain simulation); classification (compute security index; class «normal», «alarm», «emergency», «in-extreme»).]

In order to obtain a labeled training set (a "problem book"), each of these states is assigned a security index (here readers may refer to [14]), which is calculated by the following expression:

$$ SI = \frac{w_1 \sum_{i=1}^{n_L} LOI_i + w_2 \sum_{i=1}^{n_B} VDI_i}{n_L + n_B}, \qquad (1) $$

where $w_1$ and $w_2$ are the weighting factors of system security; $LOI$ is the line overload index; $VDI$ is the index of voltage deviation at nodes; $n_L$ and $n_B$ represent the number of lines and buses respectively.

Thus, the security index is determined by calculating the VDI and LOI, which are obtained using the following expressions:

$$ LOI_{km} = \begin{cases} \dfrac{S_{km} - S_{lim}}{S_{km}} \cdot 100, & \text{if } S_{km} > S_{lim}; \\ 0, & \text{if } S_{km} \le S_{lim}. \end{cases} \qquad (2) $$

$$ VDI_k = \begin{cases} \dfrac{|U_k^{min}| - |U_k|}{|U_k^{min}|} \cdot 100, & \text{if } |U_k| < |U_k^{min}|; \\ 0, & \text{if } |U_k^{min}| \le |U_k| \le |U_k^{max}|; \\ \dfrac{|U_k| - |U_k^{max}|}{|U_k^{max}|} \cdot 100, & \text{if } |U_k| > |U_k^{max}|, \end{cases} \qquad (3) $$

where $S_{km}$ and $S_{lim}$ represent the MVA flow and MVA limit of branch k-m correspondingly, and $|U_k^{min}|$, $|U_k^{max}|$ and $|U_k|$ are the minimum voltage limit, maximum voltage limit and bus voltage magnitude of bus k respectively.

Evaluating the security index as given by (1), each pattern is labeled as belonging to one of the four classes shown in Table I.

Security Index       Class Category / Power State
SI = 0               Normal state
0 < SI ≤ 5%          Alarm state
5% < SI ≤ 15%        Emergency 1 state
SI > 15%             Emergency 2 state

TABLE I. CLASS LABELS FOR POWER SECURITY ANALYSIS.

To give an idea of the criteria without going into details, the system states are described below.
• Normal state implies that all parameters of the power system are maintained within specified normal operation limits.
• Alarm state implies that some of the system parameters exceed the specified normal limits (for example, the bus voltage deviation can exceed 5% but remain within 10%). Depending on the operation rules, actions can be taken to bring the system back to the normal state.
• Emergency 1 state implies that the system is still intact; however, some system constraints are violated. The system can be restored to the normal state (or at least to the alarm state) if suitable corrective actions are taken.
• Emergency 2 state implies that the current situation cannot be corrected and will lead to a major emergency. Control actions such as load shedding or controlled system separation are used to save as much of the system as possible from a widespread blackout.

When the values of the indices exceed the specified security limits and there is a high probability of the emergency situations that correspond to these values, the respective preventive or emergency control measures can be formed.

V. CASE STUDY

The feasibility of the approach has been demonstrated as a proof of concept on the IEEE 118 power system consisting of more than 118 buses, 54 generators, and 186 transmission lines. The open-source environment R [16] with the caret package [17] is used as the computing environment for the design and testing of the proposed models.

A. Data base generation

Operating conditions are all generated using the Power System Analysis Toolbox (PSAT) [18]. The data base generation and data conversion are conducted in MATLAB. The overall data base generation procedure, whose aim was to provide a representative sample of possible power system states combining various prefault operating states and disturbances, is illustrated in Fig. 2.

A data base representative of the potential power system states was obtained by first generating a sample of various prefault situations and then applying random disturbances to each state to produce the corresponding stability scenarios. To obtain the data base composed of 6877 states, each of the prefault normal or heavy load states was combined with the possible disturbances. These have been simulated with a variable-step MATLAB/PSAT quasi-dynamic simulation program, which computed the attribute values and allowed us to classify the scenarios, based on the security index, as normal, alarm, emergency 1 and emergency 2. The 490 initial candidate attributes, such as active and reactive power flows and voltages, are used to characterize the power system states. They represent essentially power system quantities which may be available from the SCADA system or the Phasor Data Concentrator system.

B. Estimating Performance For Classification

Classification problems require proper performance metrics. We used the following metrics:
• The overall accuracy of a model indicates how well the model predicts the actual data.
• The Kappa statistic takes into account the expected error rate:

$$ k = \frac{O - E}{1 - E}, \qquad (4) $$

where $O$ is the observed accuracy and $E$ is the expected accuracy under chance agreement.

C. Overview of obtained results

Ensemble and single tree methods have been built for classifying the power system states, for various candidate attributes and four different security classifications. Tab. II compares the accuracy achieved by state-of-the-art classification tree learning algorithms. Namely, the following classifiers were tested: the J48 decision tree, Breiman's conventional non-parametric decision tree learning technique CART, bagged CART (BCART), Random Forest (RF), Extra Trees (ET) and the Stochastic Gradient Boosting (SGB) method. The ensemble and single tree methods were trained on a dataset of 6877 samples and tested on 1715 samples. The confusion matrix is shown in Tab. III. Each column of the matrix represents the instances in a predicted class, while each row represents the instances in an actual class.

Metrics        J48      CART     BCART    RF       ET       SGB
Accuracy, %    99.83    99.07    99.88    100.00   100.00   100.00
Kappa, %       99.72    98.52    99.81    100.00   100.00   100.00

TABLE II. CLASSIFICATION ACCURACY COMPARISON (BCART, RF, ET AND SGB ARE ENSEMBLE METHODS).

           alarm     emergency1   emergency2   normal    class error
alarm      947.00    1.00         0.00         0.00      0.00
emerg1     4.00      279.00       5.00         0.00      0.03
emerg2     0.00      4.00         252.00       0.00      0.02
normal     2.00      0.00         0.00         225.00    0.01

TABLE III. CONFUSION MATRIX.

Fig. 3 shows testing errors with respect to the number of trees for the normal, alarm, emergency1 and emergency2 cases. It should be noted that all obtained accuracy values are close. However, some of the models have additional useful properties. For example, Extra Trees needs less memory than classical Random Forest, and an amount comparable to Stochastic Gradient Boosting.

[Figure 3. Testing error with respect to the number of trees. Vertical axis: Error, %; horizontal axis: Number of trees (0 to 100); curves: Test, alarm, emergency1, emergency2, normal.]

Fig. 4 shows variable importance for all classes obtained by computing the mean decrease in the Gini index. The classification trees select voltages under normal states as the most important attributes for security monitoring and assessment. This may be explained by the fact that the voltage sag observed in the power system state reflects the proportional increase in load when the static characteristics of the load model depend on voltage. Under alarm and emergency states the active and reactive power flow attributes were selected in preference to voltages. A possible explanation lies in the fact that this security criterion is more preventive in nature.

[Figure 4. Variable importance for all classes obtained by computing of mean gini index decrease.]

We also demonstrated the feasibility of dealing with incomplete and distorted data. To take SCADA malfunctions into consideration, corrupted patterns were used to train the ensemble classification trees. The results showed that the test error rate did not change even with 50% of gaps in the data (Tab. IV).

% of gaps    time in sec.    test error, %
10           0.0123          0.93
30           0.0411          0.93
50           0.0514          0.93

TABLE IV. FILLING THE GAPS IN DATA

VI. CONCLUSION

The ensemble classification methods were tested on the modified IEEE 118 power system, showing that the proposed technique can be employed to examine whether the power system is secure under steady-state operating conditions. The experimental studies showed that the ensemble methods can identify key system parameters as security indicators with high accuracy and, if required, the obtained tree-based security model can produce an alarm for triggering the emergency control system. The next stage of this work will involve taking real power system data and modelling multiple decision making with many grid participants.

REFERENCES

[1] D. Panasetsky, D. Tomin, N. Voropai, V. Kurbatsky, A. Zhukov, D. Sidorov, Development of software for modelling decentralized intelligent systems for security monitoring and control in power systems, In Proc. of PowerTech Conf., IEEE PES, Eindhoven, June 29–July 2 2015, pp. 1-6.
[2] L. Wehenkel, Machine Learning Approaches to Power System Security Assessment. Dissertation, University of Liege, 1995.
[3] A. B. Osak and A. I. Shalaginov, Methods for rapid analysis in the problem of security assessment based on short-term forecasting system behavior, Saint-Petersburg, 2014. (in Russian).
[4] Methods and models for power system reliability studies, Syktyvkar: Komi Scientific Center of Ural Branch of RAS, 2010, 292 p. (in Russian).
[5] R. Diao et al. Decision tree-based online voltage security assessment using PMU measurements, IEEE Trans. Power Syst. IEEE, 2009, Vol. 24, No. 2, pp. 832-839.
[6] M. Negnevitsky, N. Tomin, D. Panasetsky, N. Voropai, Ch. Rehtanz, U. Haeger, V. Kurbatsky, Intelligent system for preventing large-scale emergencies in power systems, Electrichestvo, No. 8, pp. 1-8, Aug. 2014 (in Russian).
[7] Yu. I. Zhuravlev, On the algebraic approach to solving the problems of recognition and classification, Problemy kibernetiki, Moscow, 1978, Vol. 33, pp. 5-68.
[8] M. Kearns, L. Valiant, Cryptographic limitations on learning Boolean formulae and finite automata, J. ACM, 1994, Vol. 41, No. 1, pp. 67-95.
[9] R. E. Schapire, The boosting approach to machine learning: an overview, In Nonlinear Estimation and Classification, Springer, 2003, Vol. 171, pp. 149-171.
[10] Y. Freund, R. E. Schapire, Experiments with a new boosting algorithm, Proc. ICML, 1996, Vol. 96, pp. 148-156.
[11] L. Breiman, Random Forests, Mach. Learn., 2001, pp. 1-33.
[12] S. P. Chistyakov, Random Forest: Overview, Proc. of Karel Scientific Center of RAS, No. 1. 2013. pp. 117-136.
[13] M. Sadeghi, M. A. Sadeghi, S. Nourizadeh, A. M. Ranjbar and S. Azizi, Power System Security Assessment Using AdaBoost Algorithm, In Proc. of the NAPS'09, Starkville, MS, 2009.
[14] A. Saffari et al. On-line Random Forests, 2009.
[15] S. Kalyani, K. Shanti Swarup. Design of pattern recognition system for static security assessment and classification, Pattern Analysis & Applications, Aug. 2012, Vol. 15, pp. 299-311.
[16] R: A Language and Environment for Statistical Computing, The R Core Team, version 3.1.3 (2015-03-09).
[17] M. Kuhn, J. Wing, S. Weston et al. caret: Classification and Regression Training, 2015, R Package version 6.0-47.
[18] F. Milano. Power System Analysis Toolbox. Documentation for PSAT, University College Dublin, 2014, version 22.1.9.
[19] D. Sidorov. Integral Dynamical Models: Singularities, Signals and Control. World Scientific Series on Nonlinear Sciences. Vol. 87, Singapore: World Sc. Pub., 2015.
[20] J. B. A. London, S. A. R. Piereti, R. A. S. Benedito, N. G. Bretas. Redundancy and Observability Analysis of Conventional and PMU Measurements, IEEE Transactions on Power Systems, 2009, Vol. 24, Issue 3, pp. 1629-1630, DOI: 10.1109/TPWRS.2009.2021195.
[21] P. Gopakumar, M. J. B. Reddy, D. K. Mohanta. Transmission line fault detection and localisation methodology using PMU measurements, IET Generation, Transmission & Distribution, 2015, Vol. 9, Issue 11, pp. 1033-1042, DOI: 10.1049/iet-gtd.2014.0788.
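
ILLUSTRATIVE NOTE ON EQS. (1)-(4)

The security index of Eqs. (1)-(3), the class thresholds of Table I and the accuracy and Kappa metrics of Eq. (4) can be sketched in a few lines. The following is a minimal sketch in Python, not the R/caret implementation used in the paper. The function names, the equal weights w1 = w2 = 1 and the toy branch and bus values are assumptions introduced only for illustration; the confusion matrix is copied from Table III.

```python
# Minimal sketch of Eqs. (1)-(4); all names and default weights are assumptions.

def line_overload_index(s_km, s_lim):
    """Eq. (2): percentage overload of branch k-m, zero within the MVA limit."""
    return (s_km - s_lim) / s_km * 100.0 if s_km > s_lim else 0.0


def voltage_deviation_index(u_k, u_min, u_max):
    """Eq. (3): percentage violation of the voltage band at bus k."""
    if u_k < u_min:
        return (u_min - u_k) / u_min * 100.0
    if u_k > u_max:
        return (u_k - u_max) / u_max * 100.0
    return 0.0


def security_index(branches, buses, w1=1.0, w2=1.0):
    """Eq. (1). branches: (s_km, s_lim) pairs; buses: (u_k, u_min, u_max) triples.
    The equal weights are an assumption; the paper does not state w1 and w2."""
    loi = sum(line_overload_index(s, lim) for s, lim in branches)
    vdi = sum(voltage_deviation_index(u, lo, hi) for u, lo, hi in buses)
    return (w1 * loi + w2 * vdi) / (len(branches) + len(buses))


def classify_state(si):
    """Table I thresholds, with SI expressed in percent."""
    if si == 0:
        return "normal"
    if si <= 5:
        return "alarm"
    if si <= 15:
        return "emergency1"
    return "emergency2"


def accuracy_and_kappa(cm):
    """Eq. (4): observed accuracy O and Kappa from a square confusion matrix
    (rows are actual classes, columns are predicted classes)."""
    n = sum(sum(row) for row in cm)
    size = len(cm)
    observed = sum(cm[i][i] for i in range(size)) / n
    col_totals = [sum(row[j] for row in cm) for j in range(size)]
    expected = sum(sum(cm[i]) * col_totals[i] for i in range(size)) / n ** 2
    return observed, (observed - expected) / (1.0 - expected)


# Confusion matrix of Table III (alarm, emergency1, emergency2, normal).
TABLE_III = [
    [947, 1, 0, 0],
    [4, 279, 5, 0],
    [0, 4, 252, 0],
    [2, 0, 0, 225],
]

if __name__ == "__main__":
    # Hypothetical toy state: one overloaded branch and one undervoltage bus.
    si = security_index([(120.0, 100.0), (80.0, 100.0)],
                        [(0.90, 0.95, 1.05), (1.00, 0.95, 1.05)])
    print(round(si, 2), classify_state(si))
    acc, kappa = accuracy_and_kappa(TABLE_III)
    print(round(acc * 100, 2), round(kappa * 100, 2))
```

Applied to the matrix of Table III, this computation gives an accuracy of about 99.07% and a Kappa of about 98.52%, which coincide with the CART column of Table II. The toy state evaluates to SI ≈ 5.48% and therefore falls into the Emergency 1 class of Table I.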