Applying representational state transfer (REST) architecture to archetype-based electronic health record systems Erik Sundvall, Mikael Nyström, Daniel Karlsson, Martin Eneling, Rong Chen and Håkan Örman Linköping University Post Print N.B.: When citing this work, cite the original article. Original Publication: Erik Sundvall, Mikael Nyström, Daniel Karlsson, Martin Eneling, Rong Chen and Håkan Örman, Applying representational state transfer (REST) architecture to archetype-based electronic health record systems, 2013, BMC Medical Informatics and Decision Making, (13), 57. http://dx.doi.org/10.1186/1472-6947-13-57 Licensee: BioMed Central http://www.biomedcentral.com/ Postprint available at: Linköping University Electronic Press http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-87696 Sundvalletal.BMCMedicalInformaticsandDecisionMaking2013,13:57 http://www.biomedcentral.com/1472-6947/13/57 SOFTWARE Open Access Applying representational state transfer (REST) architecture to archetype-based electronic health record systems Erik Sundvall1*, Mikael Nyström1, Daniel Karlsson1, Martin Eneling1, Rong Chen1,2 and Håkan Örman1 Abstract Background: The openEHR project and the closely related ISO 13606 standard have definedstructures supporting thecontent of Electronic Health Records (EHRs). However, there is not yet any finalized openEHR specification of a service interface to aid application developers in creating, accessing, and storing the EHRcontent. The aim of this paper is to explore how the Representational State Transfer (REST) architectural style can be used as a basis for a platform-independent, HTTP-based openEHR service interface. Associated benefits and tradeoffs of such a design are also explored. Results: The main contribution is theformalization of the openEHR storage, retrieval, and version-handling semantics and related servicesinto animplementable HTTP-based service interface. The modular design makes it possible to prototype, test, replicate, distribute, cache, and load-balance thesystem using ordinary web technology. Other contributions are approaches to query and retrieval oftheEHR content thattakes caching, logging, and distribution into account. Triggering on EHRchange events is also explored. A final contribution is anopen source openEHR implementation using theabove-mentionedapproachesto create LiU EEE, an educational EHRenvironment intended to helpnewcomers and developers experiment with and learn about thearchetype-based EHRapproach and enable rapid prototyping. Conclusions: Using REST addressed many architectural concerns in a successful way,but an additional messaging component was neededto address some architectural aspects.Many ofour approachesare likely of value to other archetype-based EHR implementationsand maycontribute to associated service model specifications. Introduction One barrier to understanding openEHR is its size; the There are several intertwined motivations behind this complete openEHR specification documents are com- work that together form the requirement context and prehensive, see Figure 1. The size is justified by the fact resulting design philosophy. They stem from years of that the approach tries to cover the semantic underpin- using, teaching [1], researching, and implementing sys- ning for systems ranging up to and beyond nation-wide tems [2-4] based on openEHR and archetypes. Student EHR networks for lifelong records [5]. A functional, project participants, clinicians, and software developers transparent, and manageable EHR system on which to approaching parts of openEHR often need a lot of time perform teaching and research in projects related to andefforttoget started (Figure 1). openEHR and ISO 13606 was needed when the design The above-mentioned motivations lead to require- and implementation of this architecture started. The ments for an architecture intended to clearly separate result had to be straightforward enough to allow new concerns, and that should be open, modifiable, scalable, people to get started with relevant parts of the system in andatthesame timesuitableforrapid prototyping. short time, for example in Master’s thesis projects or studies aiming for a single scientific paper. A modular design that considers ‘separation of concerns’ by using *Correspondence:[email protected] 1DepartmentofBiomedicalEngineering,LinköpingUniversity, ‘pluggable’ components of limited functional scope, in Linköping58185,Sweden combination with examples with reasonable learning Fulllistofauthorinformationisavailableattheendofthearticle ©2013Sundvalletal.;licenseeBioMedCentralLtd.ThisisanOpenAccessarticledistributedunderthetermsoftheCreative CommonsAttributionLicense(http://creativecommons.org/licenses/by/2.0),whichpermitsunrestricteduse,distribution,and reproductioninanymedium,providedtheoriginalworkisproperlycited. Sundvalletal.BMCMedicalInformaticsandDecisionMaking2013,13:57 Page2of25 http://www.biomedcentral.com/1472-6947/13/57 pose the necessary counter-questions regarding how source data can be structured. Structuring has a price as it requires effort both at design time and later at data entry, and that price should be put in proportion to pos- sible gains. A goal is to have these price versus perform- ance discussions early, for example during cooperative prototyping (described by Bødker et al. [6]) and thus help in creating a more complete basis for design than what is often produced in paper-only requirement speci- fications. Ideally, the proposed architecture should make it possible and efficient to perform many experiments in order to support interaction design, prototyping, etc., processesthatusually areiterativeandagile. Figure1Prototypeplatformexamples.Thedescribed Iterative experimental prototyping and development of architectureaimstoreducethetimeneededtoimplement prototypesbasedonopenEHRandotherarchetype-basedsystems. interfaces and services using the proposed REST-based Ithasnotbeenobviousfornewcomershowtogofrom architecture can now (at publication time) bedone using comprehensivespecifications(thebinder)andarchetypesto LiU EEE (Linköping University Educational EHR Envir- workingprototypes(asonthedevicesabove). onment). Appendix A, “LiU EEE – Implementation details”,describes implementation detailsofLiUEEE. curves (using for example HTML and JavaScript) may LiU EEE is currently used for research and educational helpnewcomersbecomeproductive fast. purposes. Server side features are mainly written in Java The approach of dividing the openEHR architecture and an XML database [7] is used for storage. Client side into functional subcomponents that ease learning and features were created mainly in HTML5 and JavaScript. enable decoupled implementation also has potential The software is available as open source under an Apa- deployment and scalability implications for full scale che 2license. EHR systemsthatrequire consideration of: Background (cid:1) Openly specified flexibledeployment scenarioswith This background section describes components and the partial implementations andsolutionsfrommixed, general architectural approaches reused in this work In- diversified developersand suppliersbackedby cluding Representational State Transfer (REST), URIs, different technologies(any HTTP-capable platform/ andHTTP. programminglanguage,etc.)Thusfacilitating ‘mix Appendix B, “Introduction to openEHR”, is provided andmatch’ofuserinterface,decisionsupport, as recommended reading for readers unfamiliar with storage,and querysolutions,etc.,betweenopenEHR details of archetype based systems such as openEHR and implementations. ISO 13606. It describes openEHR features of importance (cid:1) Howto applyscalabilitysolutions,including the to the proposed REST-based architecture that need to onesbuilt intoRepresentationalStateTransfer beunderstoodbefore continuedreading.This includes: (REST)andHTTP,toopenEHRin‘scale-out’ scenarios,wherepartitioningthesystemcleverly and (cid:1) Designlayers:Reference Model (RM), Archetypes adding more computersscalesupthe loadcapacity. andTemplates[5]. (cid:1) Versioning mechanismsand the centralobjects Another design goal was to create an architecture that CONTRIBUTION,VERSIONED_OBJECTand allows and encourages rapid prototyping and end user VERSION[8]. innovation, regarding for example user interfaces (in- (cid:1) Paths,queries and theArchetypeQuery Language cluding EHR navigation and graphical overviews) and (AQL)[5,9,10]. decision support, based on realistic patient data and archetype-based data structures. A goal is to take de- The design this paper describes treats EHR infrastruc- signs and design stakeholders beyond nice but untested ture as a distributed mission-critical system, but the mockups that at implementation time may show to be paperdoesnotdiscusstowhatdegreeEHRsystemshave incapable of coping with real data, since the required (yet) become indispensable, orhow healthcareisaffected source data is not structured and well defined. Many when EHR systems become unresponsive. Thus the clinicians have well-justified wish lists regarding what an background section ends by describing approaches to optimal EHR system should do. Prototyping processes three desired properties of such systems: scalability, based on archetype-based frameworks, should preferably performance,andhigh availability. Sundvalletal.BMCMedicalInformaticsandDecisionMaking2013,13:57 Page3of25 http://www.biomedcentral.com/1472-6947/13/57 RepresentationalStateTransfer(REST),URIs,andHTTP After the optional user info part, the http scheme, RoyFielding[11]describedconstraintsanddesigndecisions illustrated in Figure 2, specifies which server and port to behind the World Wide Web as an architectural style pat- connect to. The rest of the http scheme URI could be tern named Representational State Transfer, or REST for pointing to a specific file in a directory specified by the short,thatwasappliedtothedesignoftheHypertextTrans- path, but it could also just serve as an identifier sent to fer protocol (HTTP) [12] and Uniform Resource identifiers the server that it can process in other ways, for example (URI)[13]. map todatabaseentries. The REST approach aims to support distributed hyper- The fragment part (after ‘#’) is intended for the client, mediasystemsthatscaletoInternetsize,withgeneralinter- not the server and if just the fragment is changed, no faces and intermediary components that reduce interaction newcallneedstobemadetotheserver.Client-side code latency[11]. (e.g. JavaScript) sometimes uses the fragment to keep Some of the fundamental parts of REST when applied track of internal state and to make bookmarking and the toHTTPandURIsare: browser back button act as expected (by reversing to previous views even when the previous view was locally (cid:1) Resourcesaretargetsofreferencesand areidentified generated andnotfetched from theserver). byresourceidentifiers intheform ofURIs URI templates [14] can be used to specify URI (cid:1) Resourcescanhavedifferentrepresentations structures with variable parts enclosed within curly (formats),e.g. HTML,XML,JSON(JavaScript brackets, as in http://ehr.imt.liu.se/ehr:{ehr_id}/{object_id} ObjectNotation),plaintext,serializedJavaobjects. @{versionLookup} Therepresentations aresentinthe bodyofthe message.IntheHTTP protocolrepresentation SolutionsanddesignpatternscomplementingREST metadata sentintheHTTPheadercontainsthings The REST design pattern over HTTP is not optimized likemedia(MIME)type.HTTPheaderfieldscan for server-initiated interaction or frequent update notifi- alsocontaincache directives,asexemplifiedinthe cations [15], which may also be needed in an EHR sys- ‘performance’sectionbelow. tem. Thus additional design patterns and solutions are (cid:1) Apredefined setofmethods(corresponding to helpful, but the difference in scalability properties needs verbs)thatactonresources.Themethodissupplied to be considered when selecting where and how to apply inthe client’srequest.Methods likeOPTIONSand them. GETshouldnotleadtochangesattheserverside, soresultsfrom thosemethods canusuallybe Event-drivenmessagingsystems cached. TheGET method isused toretrievea The implementation section about trigger handlers de- representationoftheresource thatthe URI scribes how message-processing systems (message bro- identifies,which ishowmost normalwebpages are kers) are used in the architecture to report EHR events fetched. ThePUT method replacescontentatthe (e.g. content updates) to interested subsystems like deci- targetURI(orcreatesitifmissing),andDELETE sion support systems and systems copying or sending deletesresources.ThePOST methodisoftenused the newly added data. Message brokers typically have tomodifyor addinformation, ortoperform other queuesortopicsthatcanbepublishedandsubscribed to operations determinedbytheserver. using paths. Our approach aims to keep messaging (cid:1) Aresponsecontainsheaders,aHTTPStatusCodeand simple and trigger-oriented; if needed, subscribers can possiblyabody.Statuscodesindicatethingslike interact and fetch further details via HTTP after being success(including‘200OK,’‘201Created,’etc.), notified of events. Many message brokers include redirection(including‘301MovedPermanently,’‘303See wildcards and regular expressions for selection of topic Other,’‘304NotModified’),errorsintheclientrequest pathandqueuepathsothat subscribersonlygetnotified (includingthefamiliar‘404Notfound’andthe‘412 of things they are interested in. Many brokers also offer PreconditionFailed’thatcanbereturnedwhenconflicts filtering based on header content in addition to topic blockconditionalupdates),andservererrors(including path. Brokers can also handle conversion between mes- ‘500InternalServerError’).Responsesindicating saging protocols and take care of authentication and redirectionorcreationofnewresourcesshouldcontain authorization. In addition to programming language a‘Location’headerfieldwiththetargetURI. APIs, some brokers can also interact via RSS, instant messagingprotocols,andpartlyviaHTTP. A Uniform Resource Identifier [13] identifies an abstract or physical resource. The scheme (for example Sockets http, ftp, mailto, urn) indicates how the rest of the URI For small, frequent notifications between components, istobeinterpreted(Figure 2). or between client and server, the POST and GET Sundvalletal.BMCMedicalInformaticsandDecisionMaking2013,13:57 Page4of25 http://www.biomedcentral.com/1472-6947/13/57 Figure2URIComponents.ThecomponentsofaURIusingthehttpidentifierscheme. requests have unneeded features that instead become (cid:1) Consistency,orratherhaving ‘atomic’transactions overhead. A more efficient approach is WebSockets [15] acrossallinvolvednodesso thata series ofactions that uses the upgrade mechanism of HTTP to open a areeitherperformedlogically asonesingle bidirectional communication channel over a single TCP operation ornotatall socket. (cid:1) Availability,meaningthatthesystem isresponsive sothateveryrequestgetsaresponseindicating Scalability-theabilitytoscale failureorsuccess Two commonly used scalability approaches are often (cid:1) Partition tolerance,meaningthatthe systemkeeps referred toasvertical andhorizontal[16](chapter9): working evenifsome network messagesdisappear, forexample dueto nodes crashingor being (cid:1) Vertical,scale‘up’or ‘getbigger’means getting temporarilycutofffromthe network bigger(often expensive) computerswith more processingpowerand storage when demand According to Gilbert et al. [19], you can only guaran- increases.Forexample, thisapproachcanbe teetwo oftheaboveatthe same time. justified byalackoftimetorewriteexisting codeto workforhorizontalscaling. DistributeddatabasesystemsandMapReduce (cid:1) Horizontal,scale ‘out’or ‘getmore’meansgetting An alternative to handling distribution at the application morefairly ordinaryserversthat arenotnecessarily level via sharding is to let some storage infrastructure, topoftheline,butratherselectedforgoodprice/ for example a distributed database, automate the distri- performanceratio. Horizontalscalingisoften bution so that the complexities of distributed transac- needed tokeep costsdown,butrequiresthatthe tions, replication, etc. are hidden to the calling applicationisdesignedtoruninadistributed application. The limitations of the CAP theorem are still environment.Horizontalscalingcaninvolve valid, but automated solutions often offer tuning alterna- sharding orMapReduceapproaches andneedsto tives to prioritize between C, A, and P. Since some considerthe CAP-theorem,seebelow. databases use ‘snapshot isolation’ and ‘multiversion con- currency control’ (MVCC) [21] to reduce the number of Sharding potentially performance-degrading locks, it is worth Sharding databases means splitting a database into dif- noting that the openEHR time-stamped ‘append only’ ferent partitions that are usually put onto different versioning system provides a snapshot functionality via servers. The effect is that one big database becomes sev- theVERSIONobjects. eral smaller ones; thus, performance can be increased Many distributed database systems are capable of run- since the index used for lookups gets smaller and the ning MapReduce [22] tasks that are suitable for batch number ofrequeststoeachserverisreduced [17,18]. processing of massive amounts of data in distributed en- Efficient sharding requires a fast way for processes to vironments by splitting up big tasks into smaller chunks figure out which shard (and thus server) to send the that can be processed in parallel by many computers. requestto(seediscussionsection fordetails). For EHRs this is likely useful for example in epidemi- ology research. TheCAPtheorem Brewer’s CAP theorem, detailed by Gilbert and Lynch Performance,caching,andreducingnumberofrequests [19] and further illustrated by Browne [20], states the Fielding states: ‘An interesting observation about following three desirable properties can not be achieved network-based applications is that the best application atthesame timeinadistributedsystem: performance is obtained by not using the network’ [11] Sundvalletal.BMCMedicalInformaticsandDecisionMaking2013,13:57 Page5of25 http://www.biomedcentral.com/1472-6947/13/57 (section 2.3.1.3). Designs can strive to minimize the is shared among computers, with the size equal to the number of calls for example by making a few larger sum of available cache memory in all the attached com- requests rather than many small. Also, by caching (tem- puters.Typically,suchacacheisusedtostoresmall,fre- porarily storing) already fetched data many calls can be quently used data elements cleverly so that they do not avoided. The cached resources could be stored in mem- havetobefetched fromdiskordatabaseasoften. ory (fast) and on disk (more space available). The local cache of a web browser is familiar to many users, and Highavailability there can also be proxy servers in the network caching Striving for high availability means keeping the system the request on the way to the client. On the server side, up and running constantly by avoiding single points of repeatedly requested data can be cached in order to failure. Distributed databases and storage systems often reduce calls to disk storage (e.g. databases) and to reuse have built in replication options to prevent data loss in already performed server-side data processing and case of server breakdowns. If sharding is done at appli- compilation. cation level, similar approachescould be considered. The headers of the HTTP/1.1 protocol [12] provide many ways to communicate cache-related information Implementation that HTTP servers and clients should obey. Some exam- This section details a suggested way to componentize plesare: openEHR-based systems – mainly HTTP-based and (cid:1) The‘expires’header(using anabsolutedate)and programming-language-independent. The technical ‘cache-control’headerslike‘max-age’(using ranges interface design suggestions can be used as input to future openEHR service specifications. This can be done inseconds) canbeusedto indicate when acached intheformofanopenEHRRESTImplementationTech- responseshouldberefreshed. (cid:1) Theheaders ‘ETag’and‘last-modified’indicateifor nologySpecification(ITS). when the content ofaresourcehasactually been changed.Theycanbeused tocomparealready Componentizationandseparationofconcerns fetched data towhatiscurrentlyontheserver,and The division of components was partly guided by the dependingonthatinformation,direct howarequest VERSIONED_OBJECT and CONTRIBUTION lifecycles shouldbehandled.The ‘ETag’headercontainsa detailed in the openEHR specification [8]. It was also string that shouldbeproducedbytheserverina guidedbywhatwasconsideredtobesuitable subsystems waythatitischangedwheneverthe content ofthe with separation of concerns for developers with different resource atthat URIchanges.TheETagstring needs interests. In a real production system, these components onlytobeunique relatedtodifferent versions ofthe are possibly sourced from different groups or vendors sameresource,butthereis noproblem ifseveral specializedinfor example: resources usethe sameETag.The‘last-modified’ header hasaresolution ofwhole seconds and may (cid:1) Userinterfaces and interaction thusnot show differences ifaresourcehasbeen (cid:1) Decisionsupportand otherservicestriggeredby updatedmorethanonceduringa second. added EHRcontent (cid:1) Theheader ‘If-None-Match’,sent fromthe client (cid:1) Datavalidationand conversionbasedonarchetypes, combinedwiththeETag valueofapreviously etc. cached response,canbeusedtomake conditional (cid:1) Storage and querying (achieving distributionand GETrequests.Theserverwillonlyservethepage if performance) thefieldsdon’tmatch. Ifthey matchtheserver sendsthe HTTPstatuscode‘304notmodified’. Some components obviously overlap categories. In the (cid:1) Ina similar way,aconditionalPUTcanbe proposed REST approach, many foundational openEHR performed ifthe client sends theheader ‘If-Match’ RM objects are available as separate resources identified forexample tocheckthatatargetresourceitwants by URIs (e.g. objects of type CONTRIBUTION and toupdate hasnotalready beenupdatedby VERSIONED_OBJECT). Figure 3 shows components somebodyelse. andcomponentrelationships. The component communication interfaces are mostly Inadistributedsystem,alsofastin-memorycachedata based on HTTP, with the addition of some event-driven can be shared among networked servers. One approach messaging via service brokers and caching via the is to use caching systems compatible with the Memcached protocol. Thus even components using a Memcached protocol [23]. To the programs calling the mix of environments and programming languages cache, it looks like a big key-value store (hashtable) that should be straightforward to integrate. If there is Sundvalletal.BMCMedicalInformaticsandDecisionMaking2013,13:57 Page6of25 http://www.biomedcentral.com/1472-6947/13/57 Contribution Builder (CB CB Trigger Handler CBTriggerlistenerexamples: CBTemporaryStorage "Offtheshelf" Interactivedecisionsupport MessageBrokers duringdataentry ArchetypeandTemplateRepository ValidatorsandConverters ContributionTriggerHandler Client InstanceBuilder EHRMetadataCache AnyMemcached ContributionTriggerlistenerexamples: compatible -Decisionsupportatcommitorsigning. key-valuestore -Replicationtootherdatabases Router Contribution HTTP(S) R/Winterfaces EHRDatabase VersionedObject QueryExecutors Ifnewstoragerepresentation Query formatsareadded,thenthe componentsneighbouring Log QueryTranslators thisnote(plus"Validators QueryStorage andConverters")willneed tobeadapted. <<utility>> Bookmarks BulkLoader DirectoryandDemographics Figure3REST-basedcomponentizationofopenEHR.ThemaincomponentsofthesuggestedREST-basedapproachandtheLiUEEE implementation.Thedesignisflexibleregardingcomponentdeployment.Inasmallorexperimentalsetting,allservercomponentscanrunin thesameJavavirtualmachineonasinglecomputer,andinafulldeploymentcomponents(andseveralinstancesofthesamecomponent) couldrunonseparateserversandbeprovidedbydifferentvendorsusingdifferenttechnologies. capacity on a single computer to run several compo- type negotiation based on client capabilities and prefer- nents, some external network calls can be avoided by ences. This allows a change of backend representation using: format at later date without changing client interface. However, to avoid excessive conversions, a likely ap- (cid:1) in-processor inter-processcommunicationif proach is to store representations in the format most supported bythe componentsor often requested byclientsinthedeployedsystem. (cid:1) usingHTTPand otherprotocolsvialocalhost Individualfocusversuspopulationfocus loopbacknetworkingatoperating systemlevelor (cid:1) usingnetwork hardwareloopback support (for Occasionally,discussionsaboutthedesignofadata stor- age that balances the need for rapid EHR access in example when using virtualizationonthesame clinical work focused on individual patients with per- machine forcomponentsrequiring different formance for population-wide queries appear on the operating systems) openEHR wiki [24] and the openEHR technical mailing lists[25]. Storageconsiderationsandgroupingofusecases In clinical work focused on individual patients, the In the backend storage, resources (VERSIONED_OBJECTs, EHR system often either accesses data as big chunks – etc.) are stored in some representation format and poten- documents – in a certain serialization format (for tially decomposed and indexed at some granularity level example openEHR COMPOSITIONs in XML format) or dependingonforeseenmajorusecases. as result listsfrom queries.In this case, itcan bereason- able to store data in chunks (e.g. documents) and if pos- Conversionsandstorageformatchanges sible even in the most frequently requested serialization Converters enable HTTP responses in different repre- format and also to index on fields relevant to frequent sentations (e.g. XML, JSON or serialized Java objects) to retrieval needs. Here, this single EHR access usage is be served from the same stored resource, after media called thesinglerecorduse case. Sundvalletal.BMCMedicalInformaticsandDecisionMaking2013,13:57 Page7of25 http://www.biomedcentral.com/1472-6947/13/57 On the other hand, research activities require URIstocentralresourcesrepresentingEHRRMobjects population-wide queries that access and aggregate When accessing a patient’s EHR, a common first view particular data fields from large amounts of EHRs. Then is a list of patient encounters, a problem list, or other it is reasonable to store these as such small pieces in for summaries; then the clinician fetches details for items example a normalized relational database management of interest. Patient summaries and graphical over- system or to use some data warehouse approach. Here, views can for example be based on AQL queries that thisaggregateEHR accessusageiscalledthemultirecord are used to populate HTML pages with data and hy- use case and the path component of the corresponding perlinks pointing to more detailed information. Ac- URIisprefixedwith/multi/. cess to details can be achieved by pointing directly to Database solutions that have acceptable retrieval speed specific versions in versioned objects using the URI tem- for the single record use case are not necessarily appro- plate path /ehr:{ehr_id}/{object_id}::{creating_system_id}:: priate forthe multi recorduse case[7]. {version_tree_id}. The proposed REST design allows keeping these two use Asexemplifiedabove,centralobjectclassesandassociated case families separate in order to provide maximal freedom methods from the openEHR specification have been for implementers to explore diversified approaches. The assigned URI patterns for direct access instead of via AQL openEHR append-only (or ‘never physically delete’) queries.TheyareexemplifiedinTable1. principle, combined with its clearly time-stamped opera- Since the provided LiU EEE implementation is tions,simplifiesreplicationfrom‘single’to‘multi’databases. hypertext-driven, these detailed URIs are embedded in Ifthemultirecordusecasedatabaseisnotusedforentering the interface pages (like the views in Figure 1) and do data, and if a bit of lag-time behind live single record use not needtobeknownbyendusers suchasclinicians. case EHRs can be accepted, then implementation is rather straightforward. Userinterface Neither the suggested REST approach nor the single The suggested REST architecture does not prescribe versus multi division dictates the use of sharding or double clients based on web browsers or any other specific user databases. Some storage solutions, for example distributed interface technology as long as it can handle the HTTP database frameworks and map-reduce-based approaches protocol and at least the one of the possibly available [22], may prove to be able to handle both the single and representations (e.g. XML, JSON, or serialized Java ob- multi use cases with good performance and thus may not jects) properly; thus Adobe Flash, .NET, or Java (includ- need the use case separation for performance optimization ingthickclients)arepossibleexamples. reasons.However,evenwithaunifieddatabaseforthesingle and multi use cases, there may still be reasons to have the Querying usecasesseparatedatURIlevel: The approach for new queries is to first translate the clinically targeted AQL-queries to other native storage (cid:1) Itmakesiteasiertoprioritize computing resources targeted query languages (such as SQL, SPARQL, or fordirectpatientcarequeries higher thanstatistics XQuery [26]) and then run the query natively in the andresearchqueries database. In the process, a unique SHA-1 checksum of (cid:1) Implementation of security can be simplified and the query, any static query parameters, and the query in optimized if you know beforehand that the query itsoriginalform arealsostored. is only allowed to access a particular patient AQL is targeted specifically towards archetype- record. Then checking of access control lists for constrained systems based on domain-specific reference that particular record combined with pre- or models [10]. Thus it targets the application domain and post-processing of queries can be done without structure rather than the underlying technical storage the need for more advanced query or result format, and the AQL queries should be unaffected by evaluation mechanisms. differences andchanges inunderlying solutions that may (cid:1) Permission to formulate database queries in the vary depending on preferences and use case (like the multi use case only needs to be granted to users singleandmulti use casesdescribed above). with ethical approval for research, etc. Also anonymization of results may be more applicable Querystorage,execution,andHTTPredirectionflow to multi than single use cases. Thequerystorageprovidespermanentfacilitiesforauditing (cid:1) Patient-specific access logging is given by simply queries sent via HTTP POST, that otherwise would not be reading the HTTP log in the single use case. For stored and logged. This has positive side effects that are the multi use case query- or result-analysis is describedintheresultssectionofthispaper. required to see if a specific record has been For the ‘single’ use case queries are POSTed to a URI queried. matching the template:/ehr:{ehr_id}/q/{query-language}/ Sundvalletal.BMCMedicalInformaticsandDecisionMaking2013,13:57 Page8of25 http://www.biomedcentral.com/1472-6947/13/57 Table1URIpatternsforcentralresources for example /ehr:12344321/q/AQL/ If the query syntax URIpathtemplate is invalid, a HTTP error ‘400 Bad Request’ is returned and no query is stored. If the query translation and stor- Examplepath(s) age instead succeeds, a HTTP ’303 See Other’ response Description redirectstoaURIaccording tothepattern: /ehr:{ehr_id}/{object_id}::{creating_system_id}::{version_tree_id} /ehr:{ehr_id}/q/{query-language}/{query-SHA}/HTTP- /ehr:12344321/56d03821-8e89-cca769b7d39e::test2.eee.mi.imt.liu.se::1 compliant clients will then automatically send a GET re- FetchesaVERSION,identifiedbycreating_system_idcombined quest to that URI and the response of the query will be withversion_tree_id,fromtheVERSIONED_OBJECTidentifiedby returned. When web browsers automate this redirection, object_id.URIscontainingfragments(aftera#sign)willhavethe sameeffect.Thus/ehr:12344321/56d03821-8e89-cca769b7d39e:: the new URL may not be accessible to JavaScript-based test2.eee.mi.imt.liu.se::1#pathwillalsofetchtheentireVERSION user interface code. Thus it is recommend that server fromtheserverandlettheclientdealwiththefragment. implementations also add the URL containing the /ehr:{ehr_id}/{object_id}::{creating_system_id}::{version_tree_id}/… {query-SHA} pattern in the more script-accessible ‘Con- /ehr:12344321/56d03821-8e89-cca769b7d39e::test2.eee.mi.imt.liu.se::1/ tent-Location’HTTPheader field. content[openEHR-EHR-SECTION.vital_signs.v1]/items[openEHR-EHR- Most databases support stored queries called with OBSERVATION.blood_pressure.v1]/data/ variable parameters, and AQL also supports variable pa- FetchesonlythepartofaVERSIONspecifiedbythepathafter rameters in queries. For performance reasons, any query theid.Thiscanbeusedbyapplicationsthatforexampledueto privacyreasonsonlywanttoretrievepartofaversion. supposed to be reused should use parameters for vari- /ehr:{ehr_id}/{object_id}/{command} able parts like ehr_id, date ranges, etc., so that optimiza- tionscanbeperformed. /ehr:12344321/56d03821-8e89-cca769b7d3/all_version_ids Any parameters starting with ‘_’ (underscore) that are Fetchesobjectlistsormetadataofdifferentkindsdependingon POSTed with the original query are interpreted as being command,seeopenEHRspecifications[8]fordetails. dynamic and are – after removing the underscore – /ehr:{ehr_id}/{object_id}@{versionLookup} converted to URI-encoded parameters and included in /ehr:12344321/56d03821-8e89-cca769b7d39@latest_version the redirection URI (together with other possibly pre- /ehr:12344321/56d03821-8e89-cca769b7d39@2005-08-02T04:30:00 existing URI query parameters). If the ‘debug’ parameter Fetchesversionbasedonlookupcommand,forexamplethe is present and set to ‘true’, the translated query will be versionthatwascurrentataspecifictime. returned as text to the client instead of being executed. /multi/ehr:{ehr_id}/{object_id}… Thus the redirected GET request logged in the standard Ad-hocqueriesinthe‘multi’usecasecanalsoproducelistsor HTTP log contains the unique query-SHA and the dy- reports,sometimescontainingdetailhyperlinkspointingto namicparameter valuesusedinthe call.The‘query’field versionedobjects.Byprependingmultitothepathsuchobjects canoptionallybefetchedfrom,andthusloggedby,themulti containing the original query, additional POSTed static databaseinsteadofthesingledatabase,ifdesired. parameters (and metadata about who created the query /ehr:{ehr_id}/contributions/ when, etc.) can later be audited and inspected by calling /ehr:12344321/contributions/?start=1&end=5&descending=true /ehr:{ehr_id}/q/{query-language}/{query-SHA}/info/. Static parameters can be used for example to send extra HTTPGETreturnsallcontributionsfortheEHRidentifiedby ehr_id,pagedbyvariablesstart(default1)andend(default20) configuration parameters to query translators. Static thedefaultorderingisdescending(mostrecentcontribution parameters are stored together with the query and listedfirst). included (sorted in alphabetical order as a JSON text HTTPPOSTisusedtocommitacontribution;allnewand string) in the calculated SHA-1 checksum. Example changedVERSIONsshouldbeincludedinthebodyofthePOST. usages of both dynamic and static parameters are illus- ThecurrentLiUEEEimplementationacceptseitherXMLor serializedJavaobjects.Someweb-basedapplicationswillinstead trated intheLiUEEEimplementation (AppendixA). ofPOSTingtothisURIprefertousethe/cb/{committer_id}/ For the ‘multi’ use case new queries are POSTed to {ehr_id}/{cb-id}/commit/command(describedinTable2)that URIs like /multi/ehr/q/{query-language}/ and redirected callsthesameverificationandstoragemechanismsinternally. to/multi/ehr/q/{query-language}/{query-SHA}/. /ehr:{ehr_id}/contributions/{contribution_id}/ /ehr:12344321/contributions/7a11c126-e1af-4022-9c36-f046693bb237/ Responseformatsandhybridqueries Fetchesthecontributionidentifiedbycontribution_id Neither the openEHR specifications nor the AQL descrip- Table1showsasubsetofURItemplatepatternsidentifyingcentralopenEHR tion [9] specifies a return format. We have implemented, objects.Abackgroundtoversioningandcommandscanbefoundinthe openEHRCommonInformationModelspecification[8].Theprefix‘ehr:’uses and are thus suggesting one, for XML-formatted query thecolonsignsothatinterfacedesignersmoreeasilycanmakecallstoEHR results. The corresponding XML Schema is available in the URIsasdescribedinSection11.3oftheopenEHRArchitectureOverview[5].By LiUEEEimplementation. prependingaslash(/)toanEHRURIfoundwithindata,itwillbecomeafully functionalHTTPlinktothedesiredEHRobject. An interesting option for those queries that don’t need to be standardized and reused between systems, is to
Description: