ebook img

Towards Faithful Graph Visualizations PDF

0.82 MB·
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Towards Faithful Graph Visualizations

JOURNALOFLATEXCLASSFILES,VOL.1,NO.1,JANUARY2016 1 Towards Faithful Graph Visualizations QuanHoang Nguyen,Peter Eades,andSeok-Hee Hong Abstract—Readabilitycriteriahavebeenaddressedasameasurementofthequalityofgraphvisualizations.Inthispaper,we arguethatreadabilitycriteriaarenecessarybutnotsufficient.Weproposeanewkindofcriteria,namelyfaithfulness,toevaluate the quality of graph layouts. We introduce a general model for quantify faithfulness, and contrast it with the well established readabilitycriteria.Weshow examplesofcommon visualizationtechniques,such asmultidimensionalscaling,edgebundling andseveralothervisualizationmetaphorsforthestudyoffaithfulness. 7 IndexTerms—faithfulness,faithfulgraphvisualization,informationfaithful,taskfaithful,changefaithful. 1 0 ✦ 2 n1 INTRODUCTION a JVISUALIZATION aids human acuities for pattern 4 detection and knowledge discovery by enabling transformation or deduction from information to ] Gknowledge through visual means. Visualization has beenwidelyusedforillustrating,formulatinghypoth- C (a)Originalgraph (b)Edgecontraction esis, identifying patterns, constructing knowledge, . sperforming tasks and making decisions. Fig.1. Anexampleofedgeconcentration[59] c [ Technologicaladvanceshavedramaticallyincreased the volume of network data generated in modern 1 applications.Assuch,graphsbecomelargerandmore v 1complex.Graphvisualizationtechniques,tocopewith 2this, are getting more sophisticated and involving 9complexparametersettingstoturngraphsintodraw- 0 ings. Thus, the major challenge is to justify how 0 reliable visualizations methods and models are. . 1 Graphdrawingalgorithmsdevelopedoverthepast 0 30yearsaimtoproduce“readable”picturesofgraphs. 7 Here “readability” is measured by aesthetic criteria, 1 :such as: v 1) Crossings:thepictureshouldhavefewedgecross- (a)Originalgraph (b)Confluentdrawing i X ings. Fig.2. Anexampleofconfluentdrawing [20] r 2) Bends: the picture should have few edge bends. a of the data. The presumption may be clearly true 3) Area: the area of a grid drawing should be small. in traditional ways of visualization using node-link Readabilitycriteriahavebeenextensivelystudiedin diagrams, which are comprised of a fairly limited numerousvisualizationsystems[72],[75].Algorithms number of nodes and edges. However, for modern that attempt to optimise aesthetic criteria have been visualization metaphors, such as 2.5D visualizations, successfully embedded in systems for analysis in a map-based visualizations, matrix representations and widevarietyofdomains,fromthefinanceindustryto their combinations, the readability criteria is not suf- biotechnology. ficient.Further,thepresumptioncanbedemonstrably In this paper, we argue that readability criteria for false because of the extensive use of clutter reduction visualizing graphs, though necessary, are not suffi- techniques, such as edge bundling. cient for effective graph visualization. Here,weintroduceanotherkindofcriterion,gener- Traditionally, it is quite commonplace for the “pre- ically called “faithfulness”, that we believe is neces- sumption”thatqualityofvisualizationismeasuredby sary in addition to readability. Intuitively, a graph the readabilityof the visualization; see [65], [68], [72], drawing algorithm is “faithful” if it maps different [75]. Such readability criteria presumes that readabil- graphs to distinct drawings1. Using mathematical ity implies thatthe picture is a faithful representation terms, a faithful graph drawing algorithm encodes an injective function. In other words, a faithful graph • Q.Nguyen,P.EadesandS.HongarewiththeSchoolofInformation Technologies,UniversityofSydney,Australia. 1.InMathematics,afaithfulrepresentationofagrouponavector E-mail:{quan.nguyen,seokhee.hong,peter.eades}@sydney.edu.au space is a linear representationin which differentelementsin the grouparerepresentedbydistinctlinearmappings. JOURNALOFLATEXCLASSFILES,VOL.1,NO.1,JANUARY2016 2 drawing algorithm never maps distinct graphs to the same drawing. Weshowseveralexamplesbelowforillustratingthe new faithfulness criteria. 1.1 Motivatingexamples Manygraphvisualizationalgorithmshaveembedded one or more aesthetic criteria to achieve readablelay- outs.Insomecaseshowever,graphlayoutalgorithms cannot avoid visual clutter or edge cluttering due (a)Unbundled (b)Bundled to high edge density from intrinsic connectivity and Fig. 3. An example graph using force-directed edge overlapping between nodes and edges. bundling Insuch cases,edgeconcentration[59],confluent draw- ing [14], [19] and edge bundling [27], [37], [81] become useful. These edge routing algorithms share the same idea: they simplify edge connections in the picture to increase readability. Such improved readability is helpful with respect to a number of tasks, yet the pictures may become less faithful. As an example, edge concentration [59] simplifies edgeconnectionsinthepicturetoincreasereadability. Figure 1 shows two pictures of a bipartite graph; Figure1aisasimpledrawingwithstraight-lineedges (a)Unbundled (b)Bundled andFigure1bisanotherdrawingwith“concentrated” Fig.4. A10percentmodificationoftheexamplegraph edges. Figure 1b has less edge crossings and is more in Figure3(a) andthe resultusingforce-directededge readable. However, Figure 1b would also give a bundling viewer, who does not know how the picture was made, at the first sight: a graph comprising of ten 1.2 Aimsandcontributions circlenodesandtwoboxesconnectedbytwelvelines. This demonstrates the lack of faithfulness of edge The examples in previous section have demonstrated concentration. thatfaithfulnessisanimportantaspectof graphvisu- For our second example, Figure 2a shows another alizations. Yet, faithfulness has yet been paid enough bipartite graph; the confluent drawing of the graph attention by the visualization community. There area is depicted in Figure 2b. Confluent drawings may be fewnotions alongthe linesoffaithfulnessinscientific notfaithful.Forexample,onemayfindthereisnolink visualization,suchasfidelityofthepicture[56],andvi- connectingthetworedcirclesinFigure2a.Incontrast, sualreconstructabilityforflowvisualization[45].Inad- one may see from Figure 2b a curve connecting the dition,severalformalmodelshavebeenproposed[9], two red circles. This is clearly an inconsistency in the [10], [50], [67], [78]; they aim for assessing the visual- confluently drawn graph. izations as well as for guiding the future of research As an example of edge bundling, Figure 3 shows inInformationVisualization.Nevertheless,thereisno two pictures of the same graph. Figure 3a is a simple model of faithfulness for graph visualization. circular layout with straight-line edges and Figure 3b In this paper, we distinguish two important con- has the same node positions but with “bundled” cepts: the “faithfulness” and the readability of visual- edges. izations of graphs. Faithfulness criteria are especially In the example of edge bundling, bundling often relevant for modern methods that handle very large sacrifices faithfulness. Remarkably, while the visual and complex graphs. Information overload fromvery effects of bundled edges give a more readable visual large data sets means that the user can get lost in representation of the overall graph, the ability to irrelevantdetail,andmethodshavebeendevelopedto locate, select or navigate individual edges, hopping increasereadabilitybydecreasingdetailinthepicture. between nodes is lost. Even worse, bundling can With a vast amount of methods and techniques result in situation where two different graphs can be for visualizing the data sets at various levels of mapped to the same picture. For example, Figure 4a granularity, it is important to know how reliable the shows a graph that differs from the graph Figure 3a visualizations being produced are. This demand has by almost 10 percent of the total number of edges. become urgent given the popularity of network data, This graph is bundled in Figure 4b. The bundled theinformationoverloadfromtechnologicaladvances representationsof the two differentgraphs (Figure3b and various challenges for visualizing large complex and Figure 4b) are identical. and dynamic networks. JOURNALOFLATEXCLASSFILES,VOL.1,NO.1,JANUARY2016 3 In summary, we make the following contributions: visualization and the effects on interpretations with • General model of graph visualization: In order respect to the intended story [43]. to describethe faithfulness concept, we presenta Thesepreviousresearchonvisualizationevaluation general model of graph visualization process in has mainly focused on information visualization in a Section 3. broad sense, while our researchaddressesthe quality • General model of the faithfulness: Here, the of graph visualizations. faithfulnessconceptisdividedintothreetypesof Recently, an algebraic process to evaluate visual- faithfulness: information faithfulness, task faith- ization design has been proposed [48]. Their notions fulness and change faithfulness. of representation invariance, unambiguous data depiction • Examples of faithfulness metrics: We further and visual-data correspondence are close to our concept define some examplesof the faithfulness metrics, of information faithfulness and change faithfulness. which aim to compare the usefulness / effective- They evaluated their approach on bar charts and ness of different visualizations. We should em- scatterplots. In contrast, our approach aims for the phasize that these metrics are just the examples quality (the faithfulness) of graph visualizations. and are not the “only” ones. • Case studies of faithfulness: As for demon- 2.2 Readabilityingraphvisualization stration, we use popular visualization methods including force-directed methods, multidimen- Graph drawing algorithms in the past 30 years typ- sionalscaling,matrixrepresentations,andhybrid ically take into account one or more aesthetic criteria metaphors to evaluate our faithfulness concept. to aim to increase the readability of the drawing and to achieve “nice” drawings. There are a wide range Therestofthepaperisoutlinedasfollows.Section3 ofaestheticcriteriaproposedforgraphvisualizations. presents our general model of graph visualization Theyinclude,forexample,(1)minimizingthenumber process. We use this visualization model to describe of edge crossings minimization [69]; (2) minimizing the faithfulness concept. Section 4 describes our gen- the number of bends in polyline edges [69]; (3) in- eralmodelofthe faithfulnessofagraphvisualization creasing orthogonality: placing nodes and edges to method. Here, we divide the general concept into an orthogonal grid [69]; (4) increasing node distribu- three kinds of faithfulness: information faithfulness, tion [13], [73]. (5) maximizing minimum edge angles task faithfulness, and change faithfulness. Section 5 between all edges of a node [66]; (6) minimizing the shows our examples of faithfulness metrics, which total area [73]; (7) maximizing the symmetries in the are computed for quantifying and comparing the underlying network structure [28]; (8) edge lengths faithfulness of different visualizations. We illustrate should be short but not too short [11]. faithfulnesswithseveralexamplesinSections6,7and Amongst these aesthetics, minimizing edge cross- 8: multidimensional scaling, edge bundling and sev- ings is the most important criterion from previous eral selected visualization metaphors (matrix-based user studies [65]. and map-based visualizations). Section 10 concludes Optimizing two or more criteria simultaneously thepaperwitharemarkaboutthefailureof3Dgraph is an NP-hard problem in general. As such, graph drawing to make industrial impact. drawings are often the results of compromise among several aesthetics. Most of previous research has also 2 RELATED WORK aimed for computational efficiency while achieving 2.1 Evaluationofvisualization the drawing readability. There have been increasing interests in evaluative In the literature, graph drawing algorithms can be research methodologies and empirical work [9], [10], generally classified, for example, in one the follow- [50], [51], [67], [78]. ing criteria: (1) A popular technique is force-directed Thereareseveralnotionsalongthelinesoffaithful- layout, which uses physical analogies to achieve an nessinscientificvisualization.Theseincludefidelityof aestheticallypleasingdrawing[3],[5],[46];(2)Several the picture [56], and visual reconstructability for flow worksextendedforce-directedalgorithmsfordrawing visualization[45]andfaithful“functionalvisualization largegraphsin[32],[79];(3)Multidimensionalscaling schema” introduced by Tsuyoshi et. al. [71]. This no- is another popular method for graph visualization tion is to judge the transformation from data schema and visual mining [7]; (4) Another approach takes to visual abstraction schema. advantage of any knowledge on topology (such as A number of quality metrics have been recently planarity or SPQR decomposition [17], [21], [31], [58], proposed for evaluating high-dimensional data visu- [59]) to optimize the drawing in terms of readability; alization[4]. A surveyof quality concerns for parallel (5) Other approaches offer representations composed coordinates is given in [12]. There is also proposal of visual abstractions of clusters to improve readabil- for judging visualizations regarding the presence ity. of difficulties and distracting elements (or so-called The faithfulness criterion proposed in this paper is “chartjunk”) [42]. Other researchfocuses on narrative important to compare the usefulness of graph visual- JOURNALOFLATEXCLASSFILES,VOL.1,NO.1,JANUARY2016 4 Data Picture Human izations. Thisfaithfulnessisdifferentfromreadability S E criteria in the literature. dS/dt 2.3 Mental map preservation in graph visualiza- tion D V L P dK/dt K An important criterion for dynamic graph drawings [ Faithfulness ] [ Readability ] is mental map preservation [18] or stability [63]. The mental map concept refers to the abstract geometric Task structures of a person’s mind while exploring visual information. The better the mental map is preserved, T R the easier the structural change of a graph is under- stood. Fig.5. Graphvisualizationmodel Topreservethementalmap,somelayoutadjustment algorithms use a notion of proximity and rearrange a A graph G = (N,E) consists a set of nodes N and drawing in order to improve some aesthetic criteria. a set of edges E. Note that, in this paper we use N These algorithms include, for example, incremental to denote the set of nodes and preserve V to denote drawingofdirectedacyclicgraphs[62],orcomputing visualization. In practice, the nodes and edges may the layout of a sequence of graphs offline [15], or have multiple attributes, such as textual labels. For using differentadjustment strategies in order to com- example, for Facebook friendship networks, a node pute the new layout [41]. Other works have studied may be associated with names, education, marriage thementalmappreservationwhile thegraphisbeing status, current position, etc; whereas an edge may updated [23], [53], [54]. This can be achieved by using have relationship types between two persons. These force directed layout techniques [23], or by using attributescanbeimportantforvisualizationandanal- simulated annealing [53]. ysis. Further, nodes and edges may be timestamped, There are some trade-offs between the readability and the visualization varies over time. and the stability of (offline) dynamic graph drawing Figure 5 shows our visualization model. It is an methods (see a survey by Brandes et al. [6]). extension of the van Wijk model [78]. Our model In contrast to the mental map in the previous encapsulatesthewhole knowledge discoveryprocess, work,thispaperdefines“changefaithfulness”,which from data to visualization to human; unlike the van captures the sensitivity of the pictures to changes Wijk model, our model includes tasks. in the data. This change faithfulness aims for the Themainprocessesofthemodelare“visualization” understanding and evaluation of dynamic graph vi- V, “perception” P, and “task” T, and described as sualization. follows. 2.4 Temporalandspatialanalysis 3.1 Visualization Foranalysisofdynamicgraphs,itcommonlyrequires to show the statistical trends and changes over time. The visualization process maps a data item d ∈ D (an Visualization of dynamic graphs also needs to pre- attributed graph) to a layout item (sometimes called serve user’s mental map [57]. The most common a “picture”) ℓ = V(d) ∈ L according to a specification techniques for representing temporal data are via s∈S. We write this as follows2: animation and the “small multiples” display (see [2]). V :D×S →L, (1) The animation approach shows visualizations of the sequence of graphs displayed in consecutive frames. with data space D, specification space S and layout The small multiples display uses multiple charts laid space L side-by-side and corresponding to consecutive time • The type of data space D to be visualized can periods or moments in time [1]. vary from a simple list of nodes and edges to Generally speaking, this previous work has been a time-varying graph with complex attributes focusedonthespacedimension.Forexample,mostof on nodes and edges. There are a vast amount the workconsidersreadabilityandstabilityormental of network data and many of them have been map preservation of 2D / 3D drawings of graphs. In generated in fairly high rates. this paper,we areconcerned aboutthe faithfulnessin • The specification space S includes, for example, both the space and the time dimensions. a specification of the hardware used such as 3 GRAPH VISUALIZATION MODEL the size and the resolution of the screen. The In this section we describe a semi-formal model for 2.For this semi-formal model, we use mathematical notation graph visualization. The section first gives basic an- moreasaconcise shorthandratherthanaprecisedescription.For example,we describe processes asfunctions,but weshould warn notations of graphs and then describes our graph thereaderthatthedomainsandrangesofthesearesometimesill- visualization model. defined. JOURNALOFLATEXCLASSFILES,VOL.1,NO.1,JANUARY2016 5 specification can also include parameter inputs as exploratory visualization, the information that is for visualization, navigation or interaction. contained in the data is not known a priori, and we • The layout space L may consist of graph draw- make pictures to get serendipitous insight. This is a ingsintheusualsense,butmoregenerallyconsist well-known paradox;we donot knowinadvancethe of structured objects in a multidimensional geo- aspects or features that should be visible and thus it metricspace.Sometimesitisconvenienttoregard is hard to assess how successful we are. the layout space as the screen; in this case, using In terms of perception, the human perceptual sys- thelanguageofComputerGraphics,itisanimage tem has several constraints. Some visual channels are space. probablymorerepresentational,expressiveorperceiv- In many applications, nodes and edges of a graph able than others. For instance, size and length are inthedataspaceDmaycontainseveralattributes.For more effective for quantitative data. Yet, for ordinal or example, a social network fromFacebook have nodes nominal data, size and length are less useful. In other representing people and edges representing their so- example, one person can distinguish some pairs / cial connections. In this network, nodes (people) may groups of colors more easily than other persons can. have other information such as age, gender, identity, Sometimes, the effectiveness of a visual channel may maritalstatus,education,andmanymore.Edgesmay varybetweenpeople;differentpeoplemayperceivea be associated with additional information of whether picture differently. they are classmate, colleague or family relationships. Typically,torepresentdifferentattributesinthelay- out space L, the drawings may comprise of a variety of visual cues, such as color, shape or transparency. 3.3 Task For classic drawing algorithms, the visualization pro- cess may compute the layout directly from the data. Forincrementalalgorithms,thevisualizationprocess Visualizations are a useful means for exploration and may use the previous layout when computing the examination of data using visual representations or current layout. The capability of modelling incre- pictures. In many practical cases, visualizations are mental algorithms as well as dynamic graphs makes developed to serve for domain-specific tasks [76]. our visualization model more general than the van Examples of common tasks include, for example, Wijk model [78]. In this case, the general form of identifying important actors and communities in a visualization process becomes: social network, or exploring possible pathways in a biological network. To understand faithfulness, it is V :D×S×L→L, important to model these tasks. where the previous layout is the input for computing Generally speaking, tasks can be distinguished as the current layout. This is necessary, for example, “low-level” tasks [80] and “high level” tasks [36]. For when a layout algorithm attempts to preserve the example, Wehrend and Lewis [80] gives a list of pos- user’s mental map. However, unless otherwise stated sible low-leveltasks (e.g.,identify, locate,distinguish, we take the simpler model in equation (1). categorize,cluster,distribute,rank,compare,associate and correlate) that one can perform for data analysis. In Hibino’s study [36] of a data set of tuberculosis, 3.2 Perception seven high-level tasks include prepare, plan, explore, The perception process maps a picture from the layout present, overlay, reorient, and other. space L to the knowledge space K. We write this as This paper models all tasks as processes that map follows: the data space D, the layout space L, and the knowl- P :L→K (2) edge space K to a result space R. Note that, our graph visualization model differs from the van Wijk In this model we use the term knowledge – sometimes model [78] by the task model. The result space R called insight or mental picture – to denote the effect may be a simple boolean space {true,false}; or more on the human of his/her observation of the picture. commonly it is a multidimensional space, with each Again,intime-varyingsituation, thehuman’spercep- dimension modelling a separate subtask. tioncandependonthe previouspicture,andperhaps it is better to write: Tasksareoftenexecutedbyusersordataanalysers. However, tasks are not necessarily performed by the P :L×K →K same person, who creates the pictures from the data. However, we use the simpler model (2) unless other- Furthermore, tasks can be performed directly by ex- wise stated. tracting the answers from the data with or without using a picture of that data. Of course, it is difficult to formally model human knowledge or insight, and it is difficult to assess The central model for the task process is T = its value [61]. In particular, in some situations such (T ,T ,T ), where T , T and T are three func- D L K D L K JOURNALOFLATEXCLASSFILES,VOL.1,NO.1,JANUARY2016 6 tions: As an example, consider the classical barycenter visualization function that takes as input a planar T :D →R D graph G=(N,E), places nodes from a specified face TL :L→R on the verticesof a convex polygon, and placesevery T :K →R. othernodeatthebarycenterofitsneighbors(see[72]). K This function is information faithful on internally tri- We extend this common notion with two less com- connected (see [72]) planar graphs. However, if the mon notions T and T . These functions are more D L input graph is not internally triconnected, then same abstract; they do not take the users perception or picture can result from several input graphs (each knowledge into account. The function T returns D internal triconnected components is collapsed onto a a result directly from the data. Intuitively, one can line), and the method is not information faithful. imagine thatT is computed by a “dataoracle”,who D can extract a result perfectly from the data. Similarly, one can think of the function T as re- 4.2 Taskfaithfulness L turning a result directly from the picture. Again, one The intuition behind task faithfulness is that the vi- can imagine a “picture oracle”, who extracts perfect sualization should be accurate enough to correctly results from the picture. If the visualization mapping perform tasks. In terms of the functions V and T createsapicturethatisnotentirelyconsistentwiththe defined above, a visualization V is task faithful with data,thenitispossibleforthepictureoracletoreturn respect to specification s∈S if adifferentresultfromthedataoracle.Inthisway,the pictureoraclemaybelimitedbythefaithfulnessofthe TL(V(d,s))=TD(d) (3) visualization mapping. for every data item d∈D. In this model, all the details (such as questions) Ifavisualizationisinformationfaithful,thenclearly of a task are considered less important. It is simi- it is task faithful, assuming the picture oracle can ex- lar to how we do ignore the details (such as algo- tractallinformationfromthepicturetoperformtasks. rithms/mechanisms) ofavisualizationmethod.Thus, In many practical cases, such extraction may depend our model is not concerned with specific questions of on the perception skills of the viewers. However, the tasks. converse may not hold. This task model is perhaps over-simplistic; for ex- Consider, for example, a visualization function V cir ample, it does not model the a priori knowledge of that draws all nodes a graph G on the circle. Clearly, the human. In addition, the task processcanbe made V isinformationfaithful;Figure3aisanexample,in cir more general by taking some combination of data, whichwecanfindallnodesandedgesinthedrawing. layoutandknowledgeandthenmappingtotheresult Further,V istask-faithful:allthedataisrepresented cir space. However, the simple model is sufficient to in the drawing, and so all tasks can be performed demonstrate the concept of faithfulness - task faith- correctly using the drawing. fulness. We use this simple model of task throughout On the other hand, Figure 3b shows a graph draw- the rest of the paper. ingusingedgebundling.Considerthetasktoestimate the number of edges between two contiguous groups 4 FAITHFULNESS MODEL of nodes on the boundary of the circle. Another task Informally, a graph visualization is faithful if the un- is to determine if there is a link connecting groups derlying network data and the visual representation of nodes. The edge bundled drawings are certainly are logically consistent. In this section, we develop task-faithful for these tasks. However, it is not in- this intuition into a semi-formal model. In fact, we formation faithful, as the original graph is no longer distinguish three kinds of faithfulness: information reconstructable from the bundled layout. faithfulness, task faithfulness, and change faithfulness. Then we discuss the difference between faithfulness 4.3 Changefaithfulness and correctness, and then between faithfulness and The intuition behind change faithfulness is that a readability. change in the visual representation should be consis- tent with the change in the original data. Note that 4.1 Informationfaithfulness this is a different concept to the mental map [18] The simplest form of faithfulness is information faith- or stability [63]; while these concepts are concerned fulness. This is based on the idea that the visual with the user’s interpretation of change, the concept representation of a data set should contain all the ofchangefaithfulnessisconcernedwiththegeometry information of the data set, irrespective of tasks. In of change. termsofthenotationdevelopedabove,avisualization Change faithfulness is important in dynamic set- V is information faithful if the function V is injective, tings, such as interactive or streamed graph drawing. that is, V has an inverse. However,itisalsovalidinstaticsettings, becausethe JOURNALOFLATEXCLASSFILES,VOL.1,NO.1,JANUARY2016 7 than the graph size. In such cases, faithfulness is sometimes be sacrificed for readability. In specific domains, there are important tasks for which the visualization can be both readable and task-faithful. 4.4.2 Faithfulnessanddeterminism Faithfulness is a different concept than the idea of determinism. Determinism refers to the requirement that applying the same visualization on the same Fig.6. InteractiongroupsbetweenHealthresearchers graph should give the same or similar results. For intheEuroSiSdataset example, tree drawing algorithms are deterministic, difference between two pictures should be consistent whereas other graph drawing algorithms, such as with the difference between the two data items that spring-embedder, are non-deterministic. they represent. Determinism is an important criterion for visual Consider, for example, a function V that visu- exploration and navigation of graphs. However, a groups alizes the interaction networks d that occur between visualization method may either be deterministic or European Science in Society researchers in Health3. non-deterministic, while aiming for faithfulness. Suppose thatV usesa forcedirectedalgorithm groups to draw the connected components of a graph d∈D 5 FAITHFULNESS METRICS separately, and arranges these components horizon- tallyacrossthescreen,asinFigure6.NotethatVgroups Faithfulness is not a boolean concept; a visualization isinformation faithful.However, Vgroups is not change method maybe alittle bitfaithful,butlessthan100% faithful, because a small change in the graph d (such faithful. Often, one may not tell whether a picture is asaddingan edge)canresult in a largechange in the absolutelyfaithfulor unfaithful.Butone cancompare picture. if a picture is more faithful than the other picture of the same data. 4.4 Remarks Theclassicalconceptofreadabilityofagraphdraw- ingcanbeevaluatedusinganumberofmetrics.These 4.4.1 Faithfulnessandreadability include, for example, the number of edge crossings, Moreimportantly,faithfulnessisadifferentconceptto the number of edge bends, and the area of a grid readability. Readability is well studied in the Graph drawing.Thesereadabilitymetricsareformalenough Drawing literature. It refers to the perceptual and that the problem of constructing a readable graph cognitive interpretation of the picture by the viewer. drawing can be stated as a number of optimization Readability depends on how the graphical elements problems; thus optimization algorithms can be used. are organized and positioned. It does not depend There is not really a clear winner amongst the met- on whether the picture is a faithful representation of rics, and there is not a single readability metric. For the data. We can divide the readability concept into example, edge crossing number does not equal to the three subconcepts in the same way as we divided graph readabilityas a graph with one more crossings faithfulness: is not necessarily “less readable”. 1) Information readability: A drawing is information- We aim to create a list of faithfulness metrics that readable if the perception function P is one-one; play the same role. Yet we should emphasize that the that is, if two pictures appear the same to the faithfulness metrics proposed in this section are just user, then they are pictures of the same graph. the examples to show the “possibility”; they are not Effectively, this is saying that all the information the only ways to define measurement of the faithful- from the picture can be perceived by the human. ness concepts. Furthermore, we do not aim to create 2) Task readability: A visualization is task-readable for a generic “faithfulness number”, but we try to show a task T = (T ,T ,T ) if T (P(ℓ)) = T (ℓ) for D L K K L how quantitative measurements of the faithfulness every layout ℓ∈L. concepts are possible. 3) Change readability is the classical mental map. In this section we develop a framework for such A good graph visualization method should achieve metrics. We assume that the spaces D, L and R each bothfaithfulnessandreadability;inpractice,however, has a norm, which we generically denote by k·k. there may be a trade-offbetween the two ideals. This • First, each data item d in data space D can be isespeciallytruewithlargegraphs,whenthedatasize modelledinmultipledimensionalspaceA ×A × 1 2 is too large for the visualization to be information- ···×A , where A is the i-th attribute dimension. i i faithful; indeed, the number of pixels may be smaller A norm in data space D can be Euclidean norm (k · k ), or Manhattan norm (k · k ), or p-norm 3.Available at http://wiki.gephi.org/index.php/Datasets. Data 2 1 useinthisexamplearefilteredforHealthresearchersonly. (k·kp). JOURNALOFLATEXCLASSFILES,VOL.1,NO.1,JANUARY2016 8 • Second, norms in result space R can be defined 5.3 Anexampleofchangefaithfulnessmetrics in a similar manner of the norm in data space D. Tufte[77]definesthe“lie-factor”astheratioofchange • Third, for layout space L, a definition of norms inagraphicalrepresentationtothechangeinthedata. can be a bit more complex. Layout space L can WecanexpressTufte’sconceptintermsofourmodel. bemodelledasR3×C1×···×Ci,whereCi isi-th Then, for a visualization V with specification s ∈ S, dimension of visual attributes. In a visualization, the liefactorfortwodistinctdatad,d′ ∈D withd6=d′ visual attributes of nodes include location, color, is defined by: shape, transparency, texture, etc; while visual at- lie(d′,d,ℓ′,ℓ)=∆(ℓ,ℓ′)/∆(d′,d), tributesofedgesmayfurtherincludeedgebends. Further, we assume that each of these spaces has where ℓ = V(d,s) and ℓ′ = V(d′,s). Tufte’s aim is to a distance function ∆ that assigns a positive real measurethequalityofstaticvisualizationsintermsof number ∆(a,b) to each pair a, b of elements of the the lie-factor, but we can apply the same principle in space. a dynamic setting. In the ideal case (no “lie”), the lie metric has the value of 1. 5.1 An example of information faithfulness met- Intuitively, the lie factor increases as change faith- rics fulness decreases, and so for two “distinct” data ele- The simplest way to measure the information faith- ments d′ and d we can measure the changefaithfulness fulness of a graph visualization function V with a (normalized) as: specification s, is to measure its “ambiguity”. f (d′,d,ℓ′,ℓ)=exp(−lie(d′,d,ℓ′,ℓ)) (7) change For each data d and visualization ℓ = V(d,s) ∈ L, letV−1(ℓ)denotetheset{(d,s)∈D×S :V(d,s)=ℓ}. Thevaluesofthischangemetriccanvaryfrom1(very Let|V−1(ℓ)|denotethenumberofelementsinV−1(ℓ). change-faithful) to 0 (change-unfaithful). ThentheinformationfaithfulnessofV isafunctionf , info defined by 6 EXAMPLE 1: MULTIDIMENSIONAL SCAL- 1 ING AND FORCE DIRECTED APPROACHES f (d,ℓ)= . (4) info |V−1(ℓ)| This section discusses the multidimensional scaling The metric can have values ranging from 1 (very (MDS) [8] and force directed approaches to Graph faithful) to 0 (unfaithful). Drawing [16], [22], [24], [28], [39] in terms of faith- A more subtle approachis to measure the informa- fulness. tionfaithfulnessofagraphvisualizationfunctionV as The MDS approach to Graph Drawing works as the information loss in the channel. The loss of infor- follows. The input is a graph G=(N,E), and a |N| × mation during the visualization process is defined as |N| matrix of dissimilarities δ . The goal is to map u,v entropy in information theory. The information loss is each node u ∈ N to a point p ∈ Rk such that the u easiertomeasurethanthetotalinformationcontentof given dissimilarities δ are well-approximated by u,v adataset.Thereareseveraltechniquesformeasuring thedistancesd =kp −p k.Thesetofpointsp forms u,v u v u information content and information loss (for a full the layout ℓ=V(G) of G. In practice, k is commonly discussion, see [67]). 2 or 3. In most applications, δ is chosen to be the u,v graph theoretic distance between nodes u and v. 5.2 Anexampleoftaskfaithfulnessmetrics To measure the success of an MDS function, a We can measure task faithfulness as the difference “stress” [49] function is commonly used to compute betweentheresultfromthedatadandtheresultfrom the distortion between dissimilarities δu,v and fitted the visualization ℓ=V(d,s). The task faithfulness of V distances du,v. In the simplest case, the stress of a is a function f , defined by: layout ℓ∈L is: task ∆task(d,ℓ)=∆(TL(ℓ),TD(d)) (5) stress(node)(ℓ)= X(δu,v−du,v)2, (8) u6=v with respect to the task T =(T ,T ,T ) and specifi- D L K MDS can be seen as an optimization problem where cation s ∈ S. One could define a normalized version the goal is to minimize this stress function. Force of f as: task directedalgorithmshaveasimilar flavour,butviewthe ftask(d,ℓ)=1/(∆task(d,ℓ)+1). (6) problem as finding equilibrium in a system of forces. Themetriccanhavevaluesrangingfrom1(verytask- 6.1 Informationfaithfulness faithful) to 0 (task-unfaithful). Our task faithfulness metric is keptsimple enough; For most MDS approaches, there is a likelihood that we neither take the transformation from data to task vertices overlap in the “optimal” layout. In these result, nor the interpretation of the users from the cases, it is not information faithful, and different drawing to task result. In fact, different users may MDS methods would produce the same result from perceiveadrawingdifferentlyinmanypracticalcases. different data sets. JOURNALOFLATEXCLASSFILES,VOL.1,NO.1,JANUARY2016 9 6.2 Taskfaithfulness and validated task faithfulness goals with respect to tasks that depend on graph theoretic distances. We The stress formula (8) can be seen in terms of our suggest that better MDS methods could be designed task faithfulness framework. Suppose that T is a task by optimising their change faithfulness using the lie thatdependsonthe graphtheoreticdistance between factor stress above. nodes;letRbethesetofreal-valuedmatricesindexed on the node set. For a graph G=(N,E) ∈D, let T (G) D be the matrix [δu,v]u,v∈N. Suppose that the visualiza- 7 EXAMPLE 2: EDGE BUNDLING tion V places node u at location p ; let T (V(G)) be u L Edge bundling, as illustrated in Figure 3b and 4b, the matrix [d ] . Then define u,v u,v∈N has been extensively investigated to reduce visual f(node)(G)= max kT (V(G))−T (G)k , (9) clutter in graph visualizations. Many edge bundling task G∈Dn L D 2 techniques have been proposed, including hierarchi- cal edge bundling [37], geometry-based edge cluster- whereD isthesetofgraphsofsizenandk·k isthe n 2 ing[37],[52],[81],force-directededgebundling[38]and Frobenius norm. Clearly, minimising taskfaithfulness multi-levelagglometiveedgebundling[27],justtoname is equivalent to minimising the stress defined by (8). a few. The US airline networks are typical examples used 6.3 Changefaithfulness fordemonstrationofedgebundlingmethods.Figure7 Further,wecanevaluatethechangefaithfulnessofan shows several edge bundling results. MDS method. In fact, MDS methods have been used Edge bundling seems to increase task readability extensively in dynamic settings, using stress to pre- with respect to some tasks; for example, the classic servethementalmap.Supposethatattimet,wehave bundling of air traffic routes in the USA (see [38], a graph G(t), and the visualization function places [74]) seems to make it easier for a human to identify node u at point p(ut) at time t. A stress function can the main hubs and flight corridors. However, some calculate the difference between the layout ℓ(t) ∈L at readability metrics are sacrificed; for example, the time t and the layout ℓ(t′) ∈L at an earlier time t′: number of bends is increased, making individual paths difficult to follow (the authors are not aware stress(node)(ℓ(t),ℓ(t′))= X kp(ut)−pu(t′)k2. of any human experiments that measure readability u∈N for edge bundling). In this Section we make some In the so-called “anchoring” approach, t′ is zero; in remarks about the faithfulness of edge bundling. the “linking” approach, t′ is the previous time frame before t (see [6]). These measures, however, aim for 7.1 Informationfaithfulness the mental map preservation - or change readability Edge bundling reduces information faithfulness: as - rather than change faithfulness. For example, they moreedgesarebundledtogether,itbecomesharderto aim to ensure that if the change in the graph is small, reconstruct the network data from a bundled layout. then the change in the layout is small. They do not We can propose a rough-and-ready metric for this ensure that if the change in the graph is large, then reduction based on the model presented in Section 5. the change in the layout is large. Given an input graph G = (N,E), an edge bundling However, we can use the stress approach to define visualization process V partitions E into bundles E = the lie factor, such as: B ∪B ∪...∪B .LetG denotethesubgraphofGwith 1 2 k i lie(d(t′),d(t),ℓ(t′),ℓ(t))= Pu6=vkk((dδ(u(tt,))v−−δd((utt,′′v))))//δd(u(tt,′′v))kk eeddggeesseintBBii.aEnddgneobduensdeltinNgimcoentshiostdinsgenosfuernedtphoaitnGtsioisf Pu6=v u,v u,v u,v bipartite; suppose that N =X ∪Y is the bipartition i i i where d(ut,)v = kp(ut)−p(vt)k and δu(t,)v denotes the graph of Gi. In the bundled layout, Gi is indistinguishable theoretic distance between u and v in G(t). from a complete bipartite graph on the parts Xi and Y . This representation has inherent information loss. Then the normalized change faithfulness can be i Let x = |X |, and y = |Y |. The number of (labelled) measuredintermsofthedistortionofthedatachange i i i i bipartite graphs with parts X and Y is 2xiyi. Thus if relative to the layout change: i i q = k x y , then there are 2q graphs that have the f (d(t′),d(t),ℓ(t),ℓ(t′))=exp(−lie(d(t′),d(t),ℓ(t′),ℓ(t)))samePlai=y1ouitaisG.Thiscanbeusedasasimplemodel change for computing the information faithfulness of V. 6.4 Remarks 7.2 Taskfaithfulness The force directed and MDS approaches has had considerableimpactonthecommercialworld,despite Most bundling methods use a compatibility function; the fact that they do not have explicit or validated roughly speaking, a compatibility function C assigns readabilitygoals. We believe that the success of these a realnumber C(e,e′) to eachpair e, e′ of edges. Two approaches is due to the fact that they have explicit edges e and e′ are more likely to be bundled together JOURNALOFLATEXCLASSFILES,VOL.1,NO.1,JANUARY2016 10 (a)FDEB[38] (b)IBEB[74] Fig.7. ExamplesofUSairlinenetworkvisualizationsusingedgebundling (a)cyclevs.stressratio (b) number of control points vs. stress ratio Fig.8. Comparisonsofthefaithfulnessoftheedgebundledworldcupvisualizations if the value of C(e,e′) is large. A number of compati- methods. For example, one can test the following bility functions have been proposed and tested; these hypotheses: includespatialcompatibility[38]fromlength,position, Hypothesis 1: The force-directed edge bundling angle and visibility between edges, semantic compati- (FDEB) [38] methods and its force-directed vari- bility,connectivitycompatibility,importancecompatibil- ants [47], [60], [70] are task-faithful. ityandtopologycompatibility.Someofthesefunctions Hypothesis 2: Force-directed edge bundling meth- dependonly onthe inputgraphG, andsome depend ods are more task-faithful when using more control also on the layout of G. points per edge. Foranumberoftasks,suchasidentifyinghubsina We have conducted several experiments with the network, highly compatible edges are equivalent; the faithfulness metrics. We use the worldcup data4 as correct performance of such tasks does not depend our examples. Our initial studies using the metrics of distinguishing between them. Here we show that above have shown a confirmation of the above hy- stress functions can be used to define metrics for potheses. Figure 8 shows statistics of stress values computing the task faithfulness of the edge bundled at different iteration cycles of force-directed edge layout relative to such tasks. bundlingalgorithm(FDEB)[38].Thefigurehasshown Given a pair of edgese and e′ in an inputgraphG, that FDEB achieves more faithful results when more letC(e,e′) denote their compatibility. We assume that number of iterations are performed. Furthermore, in this compatibility function depends only on G and later iterations, the result becomes more faithful as not on its layout. Let ℓ ∈ L be the layout of G. For the algorithm uses more control points to improve its twoedgeseande′,letd(e,e′)bethedistancebetween layout. the curves representing e and e′ in ℓ. We can choose from a number of distance functions for curves, such 8 EXAMPLE 3: VISUALIZATION METAPHORS as the Fre´chet distance and several distance measures for point sets. The stress in ℓ is then defined as: This section discusses our new notions of faithful- ness for several representative graph visualization stress(edge)(ℓ)= X(C(e,e′)−d(e,e′))2. (10) metaphors. e6=e′ 8.1 Matrixrepresentation 7.3 Remarks Besides node-link diagrams, visualization of graphs Despite the plethora of recent papers in edge as matrices form is also popular. In fact, matrix and bundling [44], there are few evaluations of effective- node-link diagrams have different characteristics and ness.Usingtheformalmodelsoutlinedabove,onecan begin to evaluate faithfulness and compare bundling 4.Dataavailableathttp://gd2006.org/contest/WCData/

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.