The Computer Journal, 47(3), 2004. © The British Computer Society; all rights reserved

A System for Measuring Function Points from an ER–DFD Specification

Evelina Lamma1, Paola Mello2 and Fabrizio Riguzzi1

1 Dipartimento di Ingegneria, Università di Ferrara, via Saragat 1, 44100 Ferrara, Italy
2 DEIS, Università di Bologna, Viale Risorgimento 2, 40136 Bologna, Italy
Email: {elamma,friguzzi}@ing.unife.it, [email protected]

We present a tool for measuring the Function Point (FP) software metric from the specification of a software system expressed in the form of an Entity Relationship (ER) diagram plus a Data Flow Diagram (DFD). First, the informal and general FP counting rules are translated into rigorous rules expressing properties of the ER–DFD. Then, the rigorous rules are translated into Prolog. The measures given by the system on a number of case studies are in accordance with those of human experts.

Received 16 January 2003; revised 15 October 2003

1. INTRODUCTION

Software metrics are emerging as a powerful tool for the management of the software development process. Software metrics allow the application of engineering principles to software development, providing a quantitative and objective base for process and technology decisions.

Among software metrics, Function Points (FPs) [1, 2] give a measure of the size of a software system by measuring the functionalities that the system offers the user. This metric is applicable both at the beginning of the development process, in the requirements or specification phases, and at the end of the process, after implementation. The importance of FPs lies in the fact that they can be used to estimate the cost of a software development project given the requirements or specification, because Albrecht and Gaffney [2] have shown that FPs are highly correlated with work-hours.

The rules for counting FPs are described in the Function Point Counting Practices Manual [3], published by the International Function Point User Group (IFPUG) [4]. Recently, the IFPUG FP counting rules have become ISO standard number 20926 [5].

Recently, there has been a debate about the use of FPs [6, 7]. The main problem with FPs is that they are not completely independent of the person doing the count. Kemerer [8] reported a 12% difference for the same product when counted by different people in the same organization, while Low and Jeffery [9] reported a 30% variance within an organization and more than 30% across different organizations. In order to overcome this problem, Kitchenham [7] suggested simplifying the counting rules so that FPs can be automatically counted from early system representations.

In the spirit of the suggestion of [7], this paper proposes a system for the automatic counting of FPs from a specification of the software expressed in the form of an Entity Relationship (ER) diagram plus a Data Flow Diagram (DFD). The system is called FUN (FUNction point measurement) and a preliminary version of it appeared in [10].

In order to develop the system, the informal IFPUG counting rules are specialized for the case of ER–DFD and made rigorous. To achieve this aim, a number of assumptions were made. The informal counting rules expressed in natural language in [3] have thus been translated into rigorous rules expressing properties of the ER–DFD graph. FUN has been tested on seven different case studies, on which it obtained results very close to those of human counters.

The tool is valuable for three reasons. First, it saves human effort. Second, it has been experimentally shown to be in accordance with human counters. Third, if consistently used in an organization or across organizations, it overcomes the problem of dependence on the counter.

There exist in the market a number of commercial tools that help the software engineer in the process of FP counting, though none of them is fully automatic [4]. To the best of our knowledge, the only tool that is capable of automatically counting FPs is described in [11] and was developed in academia. That system starts from a specification written in UML.

FUN has been implemented in Sicstus Prolog version 3#5 [12] and is available from http://www.ing.unife.it/software/FUN/.

The paper is organized as follows. In Section 2, we describe the FP measurement process. In Section 3, we recall ER and DFD specifications and present their integration into ER–DFD specifications. Section 4 describes the application of FP rules to ER–DFD and Section 5 illustrates the system implementation in Prolog. Section 6 shows the application of the system to a number of case studies. Related works are discussed in Section 7. Conclusions and discussion of future work follow in Section 8.

2. FUNCTION POINT MEASUREMENT PROCESS

FP measurement rules are defined in the IFPUG Counting Practices Manual [3]. The method is based on identifying and counting the functions that the system has to provide, i.e. Internal Logical Files (ILFs) and External Interface Files (EIFs) (data functions), and External Inputs (EIs), External Outputs (EOs) and External Inquiries (EQs) (transaction functions). Each function identified in the system is then classified into one of three levels of complexity (simple, average and complex), and an FP number is assigned to each function according to its type and complexity. The rules for identifying the functions and for determining their complexity are expressed in natural language and they refer to a number of high-level abstractions defined in the manual [3]. Rules have been kept informal and abstract so that they can be applied to any kind of description of the system, from a requirement document to an implementation of the system. However, as a consequence, they are, to a certain extent, vague and not completely free from ambiguities.

The sum of the FP contributions from all the functions gives the unadjusted FP count (UFP):

\[ \mathrm{UFP} = \sum_{i \in \mathit{Types}} \; \sum_{j \in \mathit{Complexity}} w_{ij}\, x_{ij} \]

where Types = {ILF, EIF, EI, EO, EQ}, Complexity = {simple, average, complex}, $w_{ij}$ is the number of FPs assigned to a function of type i and of complexity j, and $x_{ij}$ is the number of functions of type i that have complexity j. The values of $w_{ij}$ are given by the IFPUG manual while the values of $x_{ij}$ are computed by the counter.

The final FP count is then obtained by multiplying the unadjusted count by an adjustment factor that expresses the influence of 14 general characteristics of the system on which the application will run.

The FP count is thus performed in six steps:

(i) identifying the type of FP count: development project, enhancement project or application;
(ii) identifying the boundary of the application subject to the measure;
(iii) identifying the data functions, classified as ILFs and EIFs, and evaluating their complexity by counting the number of Data Element Types (DET) and Record Element Types (RET) for each function;
(iv) identifying the transaction functions, classified as EIs, EOs and EQs, and evaluating their complexity by counting the number of DET and File Types Referenced (FTR) for each function;
(v) determining the number of UFP by summing the contributions of all functions; and
(vi) computing the final number of FPs by multiplying the UFP count by the adjustment factor.

Our aim is to automate Steps (iii)–(v) starting from the specification of the system in terms of an ER–DFD diagram. Steps (iii) and (iv) are the most complex, time-consuming and error-prone, and are therefore the most interesting to automate. We assume that a development project count is performed and that the boundary of the application is indicated in the ER–DFD diagram. We do not automate Step (vi) because it requires many notions about the application and the environment that are not present in the ER–DFD specification and are therefore difficult to formalize. Moreover, many companies prefer to consider only the ISO standardized UFP count.

3. ENTITY RELATIONSHIP–DATA FLOW DIAGRAMS

We will perform the measurement on the specification of the application expressed by an ER diagram [13] integrated with a DFD [14]. We consider an integration of the diagrams which is similar to Formal DFDs [15]: the data stores of the DFD are replaced by entities and relationships of the ER diagram; therefore, we have data flows entering directly into entities and relationships, and data flows coming out from them. Moreover, we distinguish three types of flows: proper data flows, which represent the exchange of data; error flows, which represent the exchange of error messages; and control flows, which represent the exchange of control information. We call such an integrated diagram an ER–DFD.

In order to distinguish between elements of the DFD and of the ER diagram that have a similar graphical symbol, we adopt the following conventions. External agents (the user or other applications) of the DFD are represented with a dashed line box to distinguish them from entities, which are represented as normal boxes. Data flows, error flows and control flows are represented by arrows to distinguish them from the connections between entities and relationships, which are represented as simple lines. Data flows, error flows and control flows are distinguished on the basis of the line of the arrow: continuous for data flows, dashed for error flows and dotted for control flows. Figure 1 shows a sample diagram. Note that either the symbol '1' or 'M' (for many) is associated with each connection between an entity and a relationship to represent the functionality of a relationship: e.g. in Figure 1, relationship rel1 is one to many from entity2 to entity1, meaning that an occurrence of entity1 appears in the relationship at most one time and occurrences of entity2 appear in the relationship at most M times.

FIGURE 1. Example of an ER–DFD diagram.

A number of fields are associated with each data flow: when a field has the same name as the attribute of an entity, they refer to the same data. When the field does not correspond to any attribute, it represents data derived from attributes by computation. Error flows and control flows do not have any fields associated with them.

We suppose that the diagram also contains the indication of the boundaries of the different applications in the form of dashed lines.

In Figure 1, two processes are shown. As can be seen, the two processes can exchange data by using an element of the ER diagram: in this example, the data that is read by Process 2 from entity1 is written by Process 1.

In order to perform the count, a number of assumptions on the ER–DFD graph have been made. We assume that every attribute of an entity or a relationship is unique, user recognizable and non-repeated in the sense of [3]. This assumption will be clarified in Section 4.2.

Each process in the DFD must be an elementary process in the sense of [3]:

    An elementary process is the smallest unit of activity that is meaningful to the user(s). ... The elementary process must be self-contained and leave the business of the application being counted in a consistent state [3].

Assuming that every process in the DFD is an elementary process is reasonable if we suppose that the DFD processes reflect the basic activities to be performed by the application. We also assume that the processing logic of every process in the DFD is unique. This assumption is not very restrictive, since it is reasonable for a DFD diagram not to have duplicated processes.

4. RULES FOR COUNTING FPs FROM ER–DFD

In this section, we present the rules for counting FPs from the specification of an application expressed in the form of an ER–DFD. For each function, we have translated the informal IFPUG rules into rigorous rules expressing properties of the ER–DFD graph. The rules obtained are rigorous because all the ambiguities and vagueness of the IFPUG rules have been removed. Thus, it was easy to translate them into code. In order to remove ambiguities and vagueness, we have interpreted the IFPUG rules in the way we thought was most reasonable. To do so, we had to make a number of assumptions, which are listed in Section 3 and are recalled below.

In Sections 4.1 and 4.2, we discuss identification and complexity rules for data functions. In Sections 4.3 and 4.4, we describe identification and complexity rules for transaction functions.

4.1. Identification rules for data functions

Data functions are ILFs and EIFs.

4.1.1. Internal logical files

The IFPUG definition of an ILF is:

    An internal logical file (ILF) is a user identifiable group of logically related data or control information maintained within the boundary of the application. The primary intent of an ILF is to hold data maintained through one or more elementary processes of the application being counted [3].

Let us now introduce the terminology used in IFPUG's manual.

    Control Information is data that influences an elementary process of the application being counted. It specifies what, when, or how data is to be processed [3].

    The term user identifiable refers to defined requirements for processes and/or groups of data that are agreed upon, and understood, by both the user(s) and software developer(s) [3].

    The term maintained is the ability to modify data through an elementary process [3].

The IFPUG identification rule for ILFs is: a group of data or control information is an ILF if it satisfies all of the following conditions:

    • The group of data or control information is logical and user identifiable.
    • The group of data is maintained through an elementary process within the application boundary being counted [3].

When applying the above rule to ER–DFD, groups of logically related data are represented by sets of connected entities and relationships, while elementary processes are represented by processes of the DFD. As discussed in Section 3, we have assumed that every process in the DFD is an elementary process. We also need the following definition:

Definition 1. (Maintained) An entity or a relationship is maintained by a process if and only if there is a data flow from the process to the entity or relationship. A set of entities and relationships is maintained by a process if and only if the process maintains all the elements of the set.

We can now present the ILF identification rule for ER–DFD.

The simplest case of an EI of data is shown in Figure 2. The simplest case of an EI of control information is shown in Figure 6.

FIGURE 6. Simplest case of EI of control information.

In order to identify EIs in the ER–DFD diagram, we have to consider the processes in the diagram. We have assumed in Section 3 that the processing logic of every process in the DFD is unique.

Rule 5. (EI identification) A process of the ER–DFD belonging to the application is an EI if

(1) there is at least one data flow from the outside to the process, and
(2) the process maintains at least one element of an ILF

or

(1) there is a flow of control information from the outside to the process.

4.3.2. External inquiry

    An external inquiry (EQ) is an elementary process that sends data or control information outside the application boundary. The primary intent of an external inquiry is to present information to a user through the retrieval of data or control information from an ILF or EIF. The processing logic contains no mathematical formulas or calculations, and creates no derived data. No ILF is maintained during the processing, nor is the behavior of the system altered [3].

Figure 7 shows the simplest case of an EQ.

FIGURE 7. Simplest case of an EQ.

Rule 6. (EQ identification) A process of the ER–DFD is an EQ if

(1) there is at least one data flow from the outside to the process;
(2) there is at least one data flow from the process to the outside;
(3) there is at least one data flow from an element of a file to the process;
(4) all the fields of data flows going outside the application boundary are among the fields of data flows to the process;
(5) no element of an ILF is maintained by the process.

4.3.3. External output

    An external output (EO) is an elementary process that sends data or control information outside the application boundary. The primary intent of an external output is to present information to a user through processing logic other than, or in addition to, the retrieval of data or control information. The processing logic must contain at least one mathematical formula or calculation, create derived data, maintain one or more ILFs or alter the behavior of the system [3].

Figure 8 shows the simplest case of an EO.

FIGURE 8. Simplest case of EO.

What distinguishes an EO from an EQ is the fact that an EQ does not elaborate the retrieved data, while an EO outputs derived data. Therefore, we have the following rule for ER–DFD.

Rule 7. (EO identification) A process in the ER–DFD is an EO if

(1) there is at least one data flow from a file to the process;
(2) there is at least one data flow from the process to the outside;
(3) data flows from the process to the outside contain at least one field that is not contained in any of the data flows from ILFs to the process.

4.4. Complexity rules for transactional functions

In order to assign the right number of FPs to each identified transactional function, we have to count the DETs and FTRs associated with the function. The definition of DETs is the same as the one for data functions. The counting rule for DETs of an EI is:

Rule 8. (EI, DET counting) Count one DET for each field in data flows from external sources to the EI. Count one DET for each error flow from the EI.

The IFPUG definition of an FTR is:

    A File Type Referenced is
    • an Internal Logical File read or maintained by a transactional function,
    • an External Interface File read by a transactional function [3].

Recalling the definition of a maintained set of entities and relationships (and hence of a file) given in Section 4.1.1, and that of a referenced one given in Section 4.1.2, we can now give the counting rule for FTRs:

Rule 9. (EI, FTR counting) Count one FTR for each ILF maintained or referenced by the process and one FTR for each EIF referenced by the process.

when entity name has sub-entities child1,...,childn. Arguments total and exclusive have to be replaced by the Boolean constants 0 and 1 stating whether the hierarchy is total and/or exclusive. Each sub-entity must then be represented by the fact

    subentity(name, [attrib1,...,attribm]).

stating that sub-entity name has attributes [attrib1,...,attribm].

A relationship between entities is mapped into a Prolog fact of the form:

    relationship(name, ent1, ent2, [attrib1,...,attribn], card1, card2).

where name is a binary relationship between ent1 and ent2 with attributes attrib1,...,attribn and cardinality card1 from ent1 and card2 to ent2. card1 and card2 can assume either value 1 or m. The cardinality of the relationship is actually never used by the rules described above, but we have decided to store this information for possible extensions of the current system.

A data flow from sour to dest with fields field1,...,fieldn is represented by a Prolog fact of the form:

    dataflow(sour,dest,[field1,...,fieldn]).

A control flow from sour to dest is represented by

    controlflow(sour,dest).

and an error flow from sour to dest is represented by

    errorflow(sour,dest).

As an example, the ER–DFD reported in Figure 1 is represented as:

    application(appli,[entity1,entity2],[relationship1],[process1,process2]).
    application(user,[],[],[user]).
    entity(entity1,[a1],[a2]).
    entity(entity2,[b1],[b2]).
    relationship(relationship1,entity1,entity2,[c1],1,m).
    dataflow(user,process1,[info]).
    dataflow(process1,relationship1,[c1]).
    dataflow(process1,entity1,[a1,a2]).
    dataflow(process1,entity2,[b1,b2]).
    dataflow(entity1,process2,[a1,a2]).
    dataflow(process2,user,[a1,a2,total]).
    controlflow(user,process1).
    errorflow(process1,user).

5.3. Example of translation of the counting rules into Prolog

In this section, we show how the rule for identifying ILFs has been translated into Prolog. The predicate ilf/2 is used to identify ILFs, according to Rule 1 and Definition 1 presented in Section 4.1.1: ilf(Appli,ILF) succeeds if ILF is a list containing the entities and relationships of an ILF for the application Appli. The ILF identification rule is implemented by the following Prolog clause (in Sicstus Prolog syntax):

    ilf(Appli,ILF):-
        application(Appli,EntList,RelList,ProcList),
        % pick a process Proc of Appli
        member(Proc,ProcList),
        append(EntList,RelList,ERList),
        % find the entities and relationships that are maintained by Proc
        findall(ER,dataflow(Proc,ER,_),ILF1),
        delete(ILF1,user,ILF2),
        remove_duplicates(ILF2,ILF),
        % verify that ILF is inside the boundary of Appli,
        subset(ILF,ERList),
        % that it is not empty,
        ILF \== [],
        % that it is connected,
        connected(ILF),
        % and that none of its partitions is itself an ILF
        \+ some_partitions_are_ILFs(Appli,ILF).

We use the Sicstus built-in predicate findall(Template,Goal,Bag), which assigns to Bag a list of the instances of Template returned by each proof of Goal found by Prolog. All variables in Goal are taken as being existentially quantified. The call findall(ER,dataflow(Proc,ER,_),ILF1) returns in ILF1 the list of all the entities and relationships maintained by the process Proc. The predicate subset(Sublist,List) verifies that all elements of Sublist are in List.

6. CASE STUDIES

In this section, we describe the application of FUN to a number of software systems. The first is an application for the management of Human Resources that is the subject of a series of case studies [17, 18, 19] of FP measurement published by IFPUG. Here we will consider [18], in which measurement is performed starting from the specification of the application expressed as an ER diagram and a DFD.

The aim of the application is to manage information about the employees of a firm. In particular, the user needs to store information about each employee, including data on the dependants of the employee, data on the salary or the hourly rate, and data on the work location. The location must be a valid location in the application Fixed Asset. If the employee works abroad, the hourly rate must be converted to US dollars by accessing the application Currency and retrieving the conversion rate. Moreover, the application has to store information about different jobs. Finally, the user needs to store information about the assignment of jobs to employees. Table 4 shows the attributes of entities and relationships and Figure 9 the ER diagram.

TABLE 4. Attributes of entities and relationships.

    Entities or relationships   Attributes
    Employee                    Social_Security_Number (key), Name, Nbr_Dependents, Type_Code
    Salaried                    Supervisory_level
    Hourly                      Standard_Hourly_Rate, US_Hourly_Rate, Collective_Bargaining_Unit_Number
    Dependant                   Dep_SSN (key), Dep_name, Dep_birth_date
    Job                         Name, Job_Number (key), Pay_grade
    Description                 Line_Number, Description_Line
    JobAssignment               Effective_Date, Salary, Performance_Rating, Status_Inactive, System_Date
    Location                    Location_Name (key), Address, City, State, Zip, Country
    Currency                    Currency_Location (key), Base_Currency, Conversion_Rate_To_Base_Currency, Date_Of_Rate

FIGURE 9. Complete ER diagram for the Human Resource application.1

1 This diagram differs from the one in [19] because the relationship Emp.Currency here is inside the application boundary. We believe this diagram is more correct since the relationship Emp.Currency is maintained by processes inside the application.

The processes that the user requires are adding, changing, inquiring and reporting information about employees, jobs and job assignments. In inquiring, the user asks for information regarding an employee, a job or a job assignment given, respectively, a Social_Security_Number, a Job_Number, or a Social_Security_Number and a Job_Number. In reporting, the application prints a list of all employees, jobs or job assignments together with the total number of shown entities. Moreover, the application should also allow inquiring on location information (process Inquire Locations): it should print, for each employee, his/her social security number together with the information on the location where he/she is working.

Among these processes, we will consider here in more detail the following:

• adding an employee, together with data on his/her dependants and the salary or hourly wage (see Figure 10 for the addition process);
• reporting on all employees, printing the list of employees together with their total number (Figure 11);
• inquiring on the data of an employee, given his/her social security number (Figure 12);
• adding job information, together with its description (Figure 13); and
• adding a job assignment (Figure 14).

After having loaded both the system code and the application description into the Sicstus interpreter, the count is started by calling the goal:

    | ?- totalFP(hr,FP).
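The definition of totalFP/2 is not listed in this section. As a rough sketch only, the unadjusted count of Section 2 (the sum of w_ij x_ij over types and complexities) could be computed in Prolog along the following lines. The predicates weight/3, counted/3, ufp/1 and sum_products/3 are hypothetical names introduced here and are not taken from FUN. The weights are the usual IFPUG values for simple, average and complex functions, and the counts are invented purely for illustration.

```prolog
% weight(Type, Complexity, FPs): unadjusted weights w_ij (usual IFPUG table).
weight(ilf, simple, 7).  weight(ilf, average, 10). weight(ilf, complex, 15).
weight(eif, simple, 5).  weight(eif, average, 7).  weight(eif, complex, 10).
weight(ei,  simple, 3).  weight(ei,  average, 4).  weight(ei,  complex, 6).
weight(eo,  simple, 4).  weight(eo,  average, 5).  weight(eo,  complex, 7).
weight(eq,  simple, 3).  weight(eq,  average, 4).  weight(eq,  complex, 6).

% counted(Type, Complexity, N): hypothetical x_ij values, invented for
% illustration only; they are not results reported in the paper.
counted(ilf, simple, 2).
counted(ei,  average, 3).
counted(eq,  simple, 1).

% ufp(-UFP): sums w_ij * x_ij over all recorded (Type, Complexity) pairs.
ufp(UFP) :-
    findall(W-N, (counted(T, C, N), weight(T, C, W)), Pairs),
    sum_products(Pairs, 0, UFP).

% sum_products(+Pairs, +Acc, -Sum): accumulates W*N for each W-N pair.
sum_products([], UFP, UFP).
sum_products([W-N|Rest], Acc0, UFP) :-
    Acc is Acc0 + W * N,
    sum_products(Rest, Acc, UFP).
```

With these invented counts, the query ufp(U) yields U = 29, i.e. 2·7 + 3·4 + 1·3. In a full system the counted/3 facts would presumably be derived from the identification and complexity rules of Section 4 rather than written by hand.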