RESEARCH ARTICLE

Combining a Spatial Model and Demand Forecasts to Map Future Surface Coal Mining in Appalachia

Michael P. Strager1*, Jacquelyn M. Strager2, Jeffrey S. Evans3, Judy K. Dunscomb4, Brad J. Kreps4, Aaron E. Maxwell5

1 Division of Resource Management, West Virginia University, Morgantown, West Virginia, United States of America, 2 Natural Resource Analysis Center, West Virginia University, Morgantown, West Virginia, United States of America, 3 The Nature Conservancy, Fort Collins, Colorado, and Department of Zoology and Physiology, University of Wyoming, Laramie, Wyoming, United States of America, 4 The Nature Conservancy, Charlottesville, Virginia, United States of America, 5 Alderson Broaddus University, Philippi, West Virginia, United States of America

* [email protected]

OPEN ACCESS

Citation: Strager MP, Strager JM, Evans JS, Dunscomb JK, Kreps BJ, Maxwell AE (2015) Combining a Spatial Model and Demand Forecasts to Map Future Surface Coal Mining in Appalachia. PLoS ONE 10(6): e0128813. doi:10.1371/journal.pone.0128813

Academic Editor: Juan A. Añel, Universidade de Vigo, SPAIN

Received: September 19, 2014
Accepted: April 30, 2015
Published: June 19, 2015

Copyright: © 2015 Strager et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability Statement: Data are available through Figshare at: http://dx.doi.org/10.6084/m9.figshare.1411220.

Funding: MPS, JMS and AEM were supported with funding from The Nature Conservancy contract number CVP_11132012. The funders had a role in selecting the study area location and technical specifics regarding the minimal mapping unit used in the study. All other decisions regarding the modeling method were decided by MPS, JMS and AEM.

Competing Interests: The authors have declared that no competing interests exist.

PLOS ONE | DOI:10.1371/journal.pone.0128813 June 19, 2015

Abstract

Predicting the locations of future surface coal mining in Appalachia is challenging for a number of reasons. Economic and regulatory factors impact the coal mining industry, and forecasts of future coal production do not specifically predict changes in the location of future coal production. Given the potential environmental impacts of surface coal mining, prediction of the location of future activity would be valuable to decision makers. The goal of this study was to provide a method for predicting future surface coal mining extents under changing economic and regulatory forecasts through the year 2035. This was accomplished by integrating a spatial model with production demand forecasts to predict land cover change at a gridded cell size of 1 km². Combining these two inputs was possible with a ratio which linked coal extraction quantities to a unit area extent. The result was a spatial distribution of probabilities allocated over forecasted demand for the Appalachian region, including the northern, central, southern, and eastern Illinois coal regions. The results can be used to better plan for land use alterations and potential cumulative impacts.

Introduction

The Appalachian region of the eastern United States is an important source of fossil fuel to meet energy needs. Within the region, underground production of coal accounts for two thirds of total production, while surface mining contributes about one third of total production [1]. Regional coal resources include steam coal used in electric power generation and (to a lesser extent) metallurgical coal used in industrial processes.

The overall future of Appalachian coal resource extraction is increasingly uncertain. There is a complex, dynamic relationship between the price of coal, the price of competing resources (in particular natural gas), and potential greenhouse gas emission reduction policies which reduce the demand for coal. Coal is subject to increased competition from natural gas as a source of energy for electricity generation, and may be equaled or surpassed by natural gas in the near future depending on oil and gas prices, greenhouse gas related policies, coal production costs, and other factors [2]. Coal production is also shifting geographically within the region, as demand for cleaner-burning, lower sulfur coal has risen due to increased environmental regulation.

Even with coal predicted to play a smaller and smaller role in America's energy mix in the future [3], the need exists to model and spatially predict where surface coal mining is anticipated in Appalachia because of its potential environmental impacts. Many studies have documented the impacts of coal mining on biodiversity [4,5], hydrology [6–8], human health [9], and water quality [10–12]. Studies have also examined the additive effects of multiple surface mines on streams [13,14] and how spatial location and network position, together with other preexisting factors (other surface mines, deep mines, and residential development), can contribute to ecological stress [15–17]. By better predicting probable areas for surface coal extraction, the potential environmental impacts on sensitive ecosystems can be identified and context dependent conservation priorities can be set in complex river systems [18].
This study provides a method for predicting future surface coal mining extents by integrating a spatial model with production demand forecasts to better represent land cover change. By combining these components, a more holistic prediction can be made. This has only recently become possible due to efforts that related the areal extent of surface coal mining activities to coal production [19]. This enabled us to combine varying estimates of surface coal mine production [2] with spatially explicit predictive modeling to map potential future surface mining footprints on the landscape. We demonstrate how the extent of surface mining can be predicted with this approach and compare the results to actual recently submitted permits.

To our knowledge, the effort most closely related to our work was by Watson [20], who mapped remaining coal reserves with high market potential in the Pittsburgh coal bed. Our approach advances that effort in three distinct ways. First, the scale of our study is wider in scope than a single coal seam. We included the entire Appalachian region, which covers the northern, central, southern, and eastern Illinois subbasins. Because of this we were able to create a forecast of future coal mining with a specific focus on surface mining throughout the Appalachian area. We also focused on surface mining rather than various underground techniques (such as longwall underground, room and pillar). For the past 10–15 years, surface mining has been a more common practice across Appalachia due to expansion of mountaintop removal techniques [21]. Throughout the region the remaining coal seams are often too thin or deep to underground mine, and the unconsolidated overlying rock makes the roof too weak for underground mining to occur safely [21]. Second, our approach incorporates forecasted demand scenarios into the model prediction, which were linked to a surface area extraction ratio by region. These two developments enabled us to spatially allocate the demand across the region under different scenarios. And third, while we would have benefitted from isopachs of coal bed thickness for all the coal seams throughout our region, many of the physical property datasets we created as model variables were locally sampled values which we interpolated with geostatistical modeling to create regional datasets. This enabled us to use locally sampled input data to address the main research question of where future surface coal mining was likely to occur throughout the regional extent.

Methods

An overview of the methodology, which includes the spatial data collection and input datasets for the predictive model of this study, is provided in Fig 1.

Fig 1. Project methodology flow chart.
doi:10.1371/journal.pone.0128813.g001

The methodology for this study included defining appropriate predictor variables, running a Random Forests [22] spatial model, and performing predictive mapping by allocating production forecasts to a future surface mine footprint. A fundamental first step in this effort was to select those landscape predictor variables which can be used to effectively model the locations of future surface coal mining.

Predictor variables

Variables included physical properties of the coal resource (coal geology type, sulfur content, ash content, and BTUs) and infrastructure related predictors (network distance to existing coal fired power plants, network distance to intermodal transportation facilities, network distance to inland ports, distance to rail, human population density). All variables were represented as raster data models with a cell size of 1 km² using ESRI ArcGIS 10.1 software [23], with the analysis extent limited to the coal geology extent within the Appalachian Landscape Conservation Cooperative (LCC) [24]. For distance rasters (distance to power plants, distance to railroads, etc.), distances were calculated to features outside the Appalachian LCC prior to limiting rasters to the study area boundary. A summary list of predictor variables is shown in Fig 2.

Coal geology type. Generalized coal field boundaries were derived from a map of coal fields of the United States at a 1:5,000,000 scale [25]. Generalized coal fields include areas with known coal-bearing geology, and were used to limit the extent of predicted future mining probability within the study area (future mining was limited to areas within mapped coal fields).
Within this coal field boundary, we also obtained state level geologic maps from datasets compiled by USGS for U.S. states [26]. The generalized state level geologic maps were classified into geologic units containing coal and those without coal. Finally, the geologic units containing coal were further cross-referenced into 17 different geological units region wide based on generalized lithology and formation. The cross referencing process was necessary due to inconsistencies in labeling among the different states. This was completed using a chronostratigraphic correlation chart [27]. Formations were grouped based on geologic age to produce 17 final mapped categories of similar lithology that are not affected by state boundaries.

Fig 2. Summary list of predictor variables.
doi:10.1371/journal.pone.0128813.g002

Sulfur percentage of coal. The sulfur content of coal is one aspect of coal quality which was important to characterize as a model variable. Restrictions on sulfur dioxide emissions from power plants have made the relative sulfur content of coal an important consideration in the economic viability of different coal resources (with low sulfur coal generally being more desirable). The percentage of sulfur content in the coal was interpolated using borehole data from the USGS Coal Quality database [28]. Prior to interpolation, borehole data were limited to samples taken at the surface (underground or deep mine samples were excluded). Underground and borehole samples (excluded) were identified by sample depth values and/or descriptive text in the comments field in the sample database. Surface samples were also identified by values in the comments field indicating samples were taken at road cuts, pits, and strip mines [28]. While different coal seams may be encountered at each of the borehole sites, an overall sulfur percentage is assumed for each site.
The interpolation process for sulfur, ash, and British Thermal Unit (BTU) content followed standard geostatistical kriging steps [29]. These included first exploring the data for normality, examining trends and the semivariogram, and testing model output runs until a satisfactory root mean squared error and mean standardized error from the cross validation prediction errors were found.

For sulfur, an ordinary kriging model was applied and anisotropy was examined to account for directional influences. This was especially useful since the coal geology follows ridge and topographical features. A total of ten lags were applied with a size of 20,000 to best fit the distribution of the input point locations. The search neighborhood was standard sized with a maximum of 5 neighbors. The results for sulfur cross validation indicated an accurate predicted surface with a root-mean-square standardized prediction error of 1.009 (a value closer to 1.0 is preferred [29]).

Ash content of coal (ash yield). Ash content of coal is also related to relative coal quality. Ash content is the portion of coal that remains after combustion. Ash yield was also obtained from the USGS Coal Quality database [28] and was interpolated using methods similar to those used for sulfur content.

For ash, ordinary kriging was again applied, with anisotropy examined for directional influences, which indicated an improved fit with an angle of 44.6° and a tolerance of 45°. The lags used were different for ash: 12 in total with a lag size of 12,000. The search neighborhood was standard sized with a maximum of 5 neighbors, as with sulfur. The results for ash cross validation indicated an accurate predicted surface with a root-mean-square standardized prediction error of 1.001.

BTU content. BTU content of coal is related to the amount of energy provided by a given amount of coal. BTU content of coal per lb. was derived from the USGS Coal Quality database [28] using methods similar to those used for ash and sulfur content.
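The cross validation statistic quoted for the sulfur and ash surfaces can be made concrete with a short sketch. It assumes the common definition of the root-mean-square standardized prediction error: each cross validation residual is divided by the kriging standard error at that point, and the root mean square of those ratios is taken. The function name and the sample numbers are illustrative only and do not come from the study.

```python
import math


def rms_standardized_error(observed, predicted, std_errors):
    """Root-mean-square standardized prediction error.

    observed   -- measured values at the sample points
    predicted  -- cross-validation predictions at the same points
    std_errors -- kriging prediction standard errors at the same points
    """
    if not observed or not (len(observed) == len(predicted) == len(std_errors)):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for obs, pred, se in zip(observed, predicted, std_errors):
        # Standardize each residual by its own prediction standard error.
        total += ((pred - obs) / se) ** 2
    return math.sqrt(total / len(observed))


# Hypothetical values: residuals of 1.0 with standard errors of 1.0.
example = rms_standardized_error([1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [1.0, 1.0, 1.0])
```

Under this definition, a value near 1 means the predicted standard errors match the observed scatter of the residuals; values above 1 suggest the variability is underestimated and values below 1 suggest it is overestimated.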
For the BTU interpolation, a simple kriging model was applied with a log score transformation to make the variances more constant throughout the study area and bring the data closer to being normally distributed. Anisotropy was applied to account for direction in the semivariogram and covariance. The preferred angle was 32° with a 21.4° tolerance. Twelve lags with a size of 15,000 were found to fit the model best with the averaged data points. Here again, the standard neighborhood search was used with a maximum of 5 neighbors. The fit for BTU was not as good as for ash and sulfur, with a root-mean-square standardized error of 0.887.

Distance to coal fired power plants. Existing coal fired power plants were identified using information published by the U.S. Energy Information Administration, based on form EIA-860 Annual Electric Generator Report [30]. The locations were determined using latitude/longitude coordinates provided by SourceWatch [31] and shapefiles provided by the Energy Information Administration [32]. We identified a total of 318 existing power plants as of 2011. We then removed a total of 92 of these plants that are scheduled for closure between 2013 and 2020 [31]. An additional 25 new coal fired facilities (including power plants, cogeneration facilities, and coal to liquids plants) that are proposed, planned, in permitting, or under construction for this area, as noted by SourceWatch [31], the Sierra Club [33], and the National Energy Technology Laboratory [34], were added to the final dataset. For our final predictor variable, we calculated distance along a highway network [35] to 251 coal fired power plant facilities (226 existing, 25 new). Distance along the highway network was initially calculated for 1 km² cells along the actual highways, and was then extrapolated out to cover all cells within the Appalachian LCC using an inverse distance weighted interpolator.

Distance to intermodal transportation facilities. Intermodal transportation facilities are locations where freight may be transferred between different modes of transportation (e.g.
truck to barge, truck to rail, etc.). Intermodal facility point locations were obtained from the National Transportation Atlas Database, and were then limited to all facilities except ports, which were mapped separately [36]. Distance to intermodal facilities was mapped along the highway network, then extrapolated out to all cells within the Appalachian LCC.

Distance to inland ports. Inland river ports were also obtained from the National Transportation Atlas Database [36] and were limited to those ports handling coal and coal related commodities. Distance to ports was mapped along the highway network, then extrapolated out to all cells within the Appalachian LCC.

Distance to rail. According to U.S. Energy Information Administration domestic coal distribution statistics, 56% of coal produced by the ten coal-producing states in the study area was distributed using rail in 2011 [32]. In addition, 29% of coal distributed domestically was moved by river (barges), and 13% was transported by truck. This implies that proximity to rail, river, and trucking related loading facilities may be an asset in the location of potential mining activity. Mining related facilities (for loading coal onto rail cars) are not necessarily limited to locations at endpoints of rail lines. Mine loading facilities can also be found at any point along rail lines, not just at the endpoints or at spurs. Mapping distance to existing rails captures more potential locations for access to rail lines from coal mining permit locations than limiting the rail feature dataset to the endpoints of existing railroads.

Locations of railroads were acquired from the Bureau of Transportation Statistics U.S. National Transportation Atlas railroads layer, at the 1:100,000 map scale [37]. Distance to the nearest rail line was mapped as Euclidean straight line distance across the Appalachian LCC (not limited to distance along a network).

Population density. Population density was calculated across the study area using 2010 Census block group data, and was then converted to raster format with a 1 km² cell size [23].

Other data considerations.
Economic factors are important to consider for future mining since development decisions have costs ultimately built into the decision process. Most of the economic variables for mining are related to the deposit geometry, stripping ratio, size, shape, and depth of strike of the deposit, rock conditions, productivities, and machinery capacities, as well as some of the more common economic costs related to capital requirements and operating costs, discount rate, investments, amortization, depreciation, recoveries and revenues, labor force availability, and environmental regulations [38]. Other factors considered to be important for new surface mining activity included past and existing mining, stripping ratios (overburden, coal bed thickness), coal reserves remaining, surface ownership patterns, and coal quality as related to market demand. Each of these factors was specifically mentioned by internal reviewers in various stages of this project, and was also mentioned in the Environmental Impact Statement for mountaintop removal mining in the Appalachian region [39]. Ultimately, these factors were not included (directly) in the final modeling process, after investigation of available datasets and data quality. Location and extent of past mining were not uniformly available for the entire study area, as mining datasets from individual states varied greatly in quality. Data related to stripping ratios (overburden, seam thickness) were available for some coal seams [40] and states in the study region (Illinois [41]; Indiana [42]; West Virginia [43]; Virginia [44]) but not others. Remaining coal reserves are available on a county-by-county basis for some states (see [45] for example) or on a regional level from the U.S. Energy Information Administration, but reserve data are not consistently published at a detailed enough spatial scale for the region to be included in the project. The focus of our modeling on surface mining activity only (rather than surface and underground combined) also placed more importance on overburden coal amounts as well as accessibility from the surface.
For surface land ownership patterns, it has been suggested that the differing nature of land ownership among states may be related to surface mining, specifically that surface mines of eastern Kentucky are characterized by smaller landowners, while surface mines in neighboring southwestern West Virginia are more likely to be owned by larger corporate landowners [39]. Based on a quick cross reference with existing permit data, we did not find this pattern, as the average permit size in Kentucky was larger than the average permit size for West Virginia. In any case, land ownership data for such a large study region are nearly impossible to assemble, particularly in light of the relatively coarse spatial scale of this work (1 km² cell size). We also did not have access to adequate mineral rights data for the entire study region, another important consideration. While these data limitations preclude our ability to make the same local decisions a coal company would make for a site, the goal of this project was to focus on broader regional predictions and forecasting.

Active surface mine permit locations

The previously listed independent variables were analyzed with the dependent variable of location of active surface mine permits. The centroids of each permit were calculated for the model runs. Surface mining permit locations were obtained from individual state agencies for the ten coal-producing states within the study area. Mining permits were further limited to active surface mining permits only by excluding underground mines and permits associated with inactive or historical mines. In certain states, if permit status (active/inactive) was not indicated, permits were limited to those with dates from the year 2000 to the present only, in an attempt to limit analysis to current, active mines.
Exclusion areas

In addition to the above mentioned predictor variables and surface mine permit locations, we also integrated spatial datasets as "exclusions", or areas where surface coal mining could not occur. Areas excluded from our predictive modeling of future surface mining include permanent conservation lands and areas with existing land uses that are not conducive to mining activities (urban and developed lands, water) based on the 2006 National Land Cover Dataset [46]. For purposes of this work, we considered permanent conservation lands to be (in most cases) lands compiled in the Conservation Biology Institute's Protected Areas Database [47] with Gap Analysis status 1 or 2. Conservation lands with Gap Analysis status 1 and 2 [48] generally indicate areas with permanent protection from land use conversion and/or management plans designed to limit disturbance and may include national parks, national wildlife refuges, state parks and preserves, and U.S. Forest Service wilderness areas (among others), although further assessment of outstanding mineral leases on these tracts may result in their re-inclusion in the area where mining may occur. In all, 57,185 km² throughout the study area (9.6%) was excluded due to land use restrictions, while 14,366 km² of the study area was excluded due to presence of conservation lands (2.4%).

Areas were also identified that contained an extensive recent history of surface mining, as we are assuming these areas to be "mined out", meaning they will not be surface mined again in the future. Mined out areas were identified as cells within current active surface mine permits that were classified as Barren land cover in the 2006 National Land Cover Dataset [46]. This method ensured we were capturing large contiguous areas of previous surface mining, and not newly opened mines (since we were using 2006 land cover). By using this method, we excluded mining on a total of 567 km², or 12% of the area contained within active surface mine permits.
Predictive coal model

We used the non-parametric model Random Forests [22] to estimate surface coal development probability for each of the 1 km² cells, with higher probabilities indicating a greater likelihood of future mining. The Random Forests algorithm offers many advantages in that it does not adhere to parametric assumptions, can use mixed data types with different scales, handles high dimensional data, is robust to outliers and noise, is not sensitive to autocorrelation, quantifies the importance of the predictor variables, and requires minimal parametrization [49–52].

The Random Forests model is a weak-learner ensemble approach, where a series of unconstrained Classification and Regression Trees (CART) are created using a bootstrap sample with replacement. The CARTs are constructed using an entropy node splitting statistic that recursively partitions the data into more homogeneous subsets and results in a hierarchical classification that accounts for 1st and 2nd order statistical variation. The out-of-bag (OOB) data withheld in each bootstrap sample are used to assess fit at each model iteration, which results in convergent fit statistics without requiring data to be withheld for validation. The independent variables are randomly permuted through the nodes, and the mean decreases in accuracy thus accumulate, providing an importance measure for each variable. The plurality of votes across the ensemble converges on the optimal fit to the data and provides a robust estimate [49].
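The bootstrap and out-of-bag mechanism described above can be illustrated with a minimal sketch. It shows only the sampling step that precedes the fitting of a single tree, not tree construction or voting, and it is not the authors' R workflow. The function name, the seed, and the sample size are assumptions made for illustration.

```python
import random


def bootstrap_with_oob(n_obs, rng):
    """Draw one bootstrap sample (with replacement) of row indices.

    Returns the in-bag indices used to grow one tree and the
    out-of-bag indices that the tree never saw, which can then
    be used to assess that tree's fit.
    """
    in_bag = [rng.randrange(n_obs) for _ in range(n_obs)]
    out_of_bag = sorted(set(range(n_obs)) - set(in_bag))
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed so the sketch is repeatable
in_bag, out_of_bag = bootstrap_with_oob(10_000, rng)
oob_fraction = len(out_of_bag) / 10_000
```

For large samples the out-of-bag fraction approaches (1 - 1/n)^n, or roughly 37%, so each tree is evaluated on a little over a third of the observations without any data being set aside in advance.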
To specify a binomial response variable (y) with the classes presence (surface mining has occurred during the permit period) and absence (no surface mining has occurred during the permit period), we used surface mine permit centroids, using only observations permitted after the year 2000. To ensure that statistical and spatial variability was represented without introducing a zero-inflation issue [49], we created five sets of pseudo-absence data by creating random points and then removing observations occurring within a current permit or within 0.5 miles of a surface mine centroid. For each training subset, we used an equal number of presence (n = 5,165) and absence (n = 5,165) observations, with the same presence data used in each subset. The independent variables were appended to the points, from the corresponding raster cell(s), using the software tool Geospatial Modeling Environment [51].

Using the compiled training data we specified five Random Forests models, one for each random subset, using the randomForest [53] package in R [54]. We tested models by removing low-performing parameters and observed a decrease in model performance as compared to the full model. Model error converged in fewer than 1,000 bootstrap replicates; however, since variable interactions stabilize at a slower rate than error, we fixed the number of bootstrap replicates at n = 1,000. Because Random Forests is an ensemble approach, as long as the parameter space remains fixed, independent models can be combined into a single ensemble model [52]. Using only consistently selected parameters in the model selection, we fit final models for each random subset and combined them into a final ensemble model. Model significance was evaluated using a permuted (n = 999) randomization procedure and an iterative 10% withhold cross-validation using the rfUtilities R package [55]. The probability of the presence class {1} was predicted, using the scaled posterior distribution of the vote plurality [50], with the R raster package [54]. The estimated extent was limited to the known extent of coal in the region.
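The pseudo-absence step can be sketched as simple rejection sampling. This is a simplified illustration, not the authors' procedure: it assumes planar coordinates in meters, draws candidates uniformly within a rectangular bounding box, and applies only the 0.5 mile distance rule around presence centroids (the exclusion of points falling inside current permit polygons is omitted). All names and coordinates are hypothetical.

```python
import math
import random

HALF_MILE_M = 0.5 * 1609.344  # 0.5 miles expressed in meters


def pseudo_absences(presences, bounds, n, rng,
                    min_dist=HALF_MILE_M, max_tries=1_000_000):
    """Draw n random points that lie farther than min_dist from every presence.

    presences -- list of (x, y) presence centroids, planar meters (assumed)
    bounds    -- (xmin, ymin, xmax, ymax) rectangle to sample within
    """
    xmin, ymin, xmax, ymax = bounds
    accepted = []
    tries = 0
    while len(accepted) < n:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not place enough pseudo-absence points")
        x, y = rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)
        # Reject any candidate that falls inside the buffer of a presence point.
        if all(math.hypot(x - px, y - py) > min_dist for px, py in presences):
            accepted.append((x, y))
    return accepted


# Hypothetical presence centroids; the absence set is balanced to the same size.
presence_pts = [(0.0, 0.0), (5_000.0, 5_000.0), (-3_000.0, 4_000.0)]
absence_pts = pseudo_absences(presence_pts, (-10_000.0, -10_000.0, 10_000.0, 10_000.0),
                              n=len(presence_pts), rng=random.Random(1))
```

Repeating the call with five different seeds would give five absence sets to pair with the same presence data, mirroring the five balanced training subsets described in the text.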
Predictive Mapping: Future Surface Mining Footprint

In order to map future potential surface mining activities on a landscape scale, we used results from the probabilistic Random Forests modeling of surface mine potential along with regional-level estimates of future coal mining production for the years 2012 through 2035.

Regional coal production estimates for the four EIA coal supply regions (northern, central and southern Appalachians, eastern interior/Illinois) (Fig 3) were obtained using various coal production scenarios from the EIA's Annual Energy Outlook [2]. Values were obtained for two different EIA economic/coal production scenarios for comparison: a low coal production scenario and a high coal production scenario. The low coal production scenario ("GHG25 + low gas") predicts the lowest future coal production of any of EIA's 28 total scenarios, due to very restrictive greenhouse gas emissions policies and low prices for competing resources of oil and gas. The high coal production scenario ("low coal cost") predicts the highest coal production due to lower costs for coal mining wages, transportation, and mine equipment (leading to increased coal production).
Fig 3. U.S. Energy Information Administration (EIA) coal supply regions within the Appalachian Landscape Conservation Cooperative boundary used in this project. Counties contained within EIA coal supply regions (Northern, Central and Southern Appalachian, Eastern Interior/Illinois) were determined from the EIA Annual Energy Outlook [2]. The area modeled was further limited to the intersection of the coal supply regions with generalized coal field boundaries for the United States, obtained from the U.S. Geological Survey [25]. This figure shows the intersection of the coal supply regions with actual coal field boundaries. The boundary of the Appalachian Landscape Conservation Cooperative is shown as a thin blue line, obtained from the U.S. Fish and Wildlife Service.
doi:10.1371/journal.pone.0128813.g003

EIA coal production estimates provide total production estimates only (surface and underground combined). We limited future production projections to surface projections only by multiplying each production total by the percentage surface according to the following regional figures based on 2010–2011 production data in the Annual Energy Outlook: northern Appalachians: 20.08% surface, central Appalachians: 48.68% surface, southern Appalachians: 40.06% surface, eastern interior/Illinois: 30.72% surface. Surface mining production estimates from the year 2012 through the year 2035 were then summed to produce a total cumulative surface coal production value for each region.
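The surface-only adjustment and the cumulative sum can be written as a short sketch. The regional surface shares are the percentages quoted above; the yearly production totals in the example are made-up placeholders rather than EIA figures, and the function and dictionary names are assumptions for illustration.

```python
# Surface share of total production by EIA coal supply region (from the text).
SURFACE_SHARE = {
    "northern_appalachia": 0.2008,
    "central_appalachia": 0.4868,
    "southern_appalachia": 0.4006,
    "eastern_interior": 0.3072,
}


def cumulative_surface_production(total_by_year, region, start=2012, end=2035):
    """Sum surface-only production for one region over start..end inclusive.

    total_by_year -- {year: total production (surface + underground)}
    """
    share = SURFACE_SHARE[region]
    return sum(total * share
               for year, total in total_by_year.items()
               if start <= year <= end)


# Placeholder totals (arbitrary units); years outside 2012-2035 are ignored.
demo_totals = {2011: 100.0, 2012: 100.0, 2013: 50.0, 2036: 10.0}
demo_surface = cumulative_surface_production(demo_totals, "central_appalachia")
```

In this toy case only the 2012 and 2013 totals fall inside the forecast window, so the result is 150 units multiplied by the central Appalachian surface share of 48.68%.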
In order to estimate surface area impacted by coal mining activities, a numeric relationship was required between surface mine production amounts and a corresponding area disturbed. It was initially proposed to use current active surface mine permit data along with recent production statistics in order to derive a production to area ratio. However, single mines may produce coal for extended periods of time, and this method would not adequately capture the entire life cycle of a mine. In addition, mapped mine permit polygons may include areas that are not actually disturbed during surface mining, so the actual disturbed area may be much smaller than the mapped permit area. A recent study concluded that mapped mine permits do not offer an accurate way to estimate area disturbed by surface mining, based on the current permit database and mapping methods used in WV and KY [56]. Instead, Lutz et al. [19] developed a regression model to estimate tons of coal produced per unit areal disturbance for 47 counties in southern WV and eastern KY. The model was based on total area of surface mining disturbance from 1985–2005 (at 5 year time intervals), compared with surface coal production statistics for corresponding time periods. Lutz et al. [19] estimated that 1 ton of coal equates to 0.87 m² of surface disturbance. For the current study, this figure was converted to 1.15 million tons of coal produced per square kilometer of surface land disturbance.
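The unit conversion behind the 1.15 million tons per km² figure is simple arithmetic, sketched here for clarity. The 0.87 m² per ton value is the one reported from Lutz et al. [19]; the helper that turns a production total into a count of 1 km² cells, including the choice to round up to whole cells, is an illustrative assumption rather than a documented step of the study.

```python
import math

M2_PER_KM2 = 1_000_000          # square meters in one square kilometer
DISTURBANCE_M2_PER_TON = 0.87   # surface disturbance per ton of coal [19]


def tons_per_km2(m2_per_ton=DISTURBANCE_M2_PER_TON):
    """Tons of coal produced per square kilometer of surface disturbance."""
    return M2_PER_KM2 / m2_per_ton


def cells_required(production_tons, tons_per_cell=1_150_000):
    """Number of 1 km2 cells needed to absorb a production total.

    Rounding up to whole cells is an assumption made for this sketch.
    """
    return math.ceil(production_tons / tons_per_cell)


ratio = tons_per_km2()               # about 1.149 million tons per km2
cells = cells_required(2_300_001)    # just over two cells' worth of coal
```

Dividing 1,000,000 m² by 0.87 m² per ton gives about 1,149,425 tons per km², which rounds to the 1.15 million tons per km² used in the study.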
Future surface mining scenarios analyzed included low coal production and high coal production models [2] for the years 2012–2035. For each scenario, we created a new map layer showing potential locations for future surface mining activities on a cell-by-cell basis using a 1 km² grid for the study area. Using the figure of 1,150,000 short tons per km², we allocated future mining production on a cell-by-cell basis within each EIA region, first to those cells with the highest future mining probability, then continuing to cells with lower future mining probability, until the total amount of future production for a particular scenario and region was allocated. Prior to allocation, adjacent cells with identical mining probability values were grouped together to ensure that contiguous areas of high mining probability were preserved in the results (rather than assigning "new" mining to single cells). Cells containing urban or built up land, water, conservation lands, and centroids of existing mining permits were excluded (masked out) prior to build-out analysis as described earlier.

Results

Random Forests Model (Probability of Future Surface Coal Mining)

The final Random Forests model scenario included the original 9 predictor variables of Fig 2. We experimented by removing low-performing variables from the model based on variable contribution to the overall result. However, alternative models with fewer variables did not perform as well as the full model, producing higher classification error rates. Model significance was tested against randomly generated models and was found to be significant (p = 0.01).

The final output of the Random Forests model is a pixel based probability of future surface mining presence (Fig 4). As estimated by the out-of-bag mean decrease in accuracy, the coal geology type and the sulfur content were found to be the most important predictor variables in the model, though all variables contributed (Fig 5). For each training dataset, the out-of-bag error estimate was around 15% and the misclassification of presence and absence points were