
Relational Databases. State of the Art Report

439 Pages·1986·29 MB·English
by  D. Bell


Relational databases
State of the Art Report 14:5

Pergamon Infotech Limited
A member of the Pergamon Group
Oxford New York Toronto Sydney Beijing Frankfurt

Published by Pergamon Infotech Limited, Berkshire House, Queen Street, Maidenhead, Berkshire, England SL6 1NF.
Telephone: 0628 39101. International: +44 628 39101. Telex: 847319 (Answerback INFO G).
Printed by A Wheaton & Company Limited, Exeter, Devonshire, England.

UDC 681.3
Dewey 658.505
ISBN 0 08 034094 6

© Pergamon Infotech Limited, 1986

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photographic, or otherwise, without the prior permission of the copyright owner.

It should be noted that the copyright for the Report as a whole belongs to Pergamon Infotech Ltd. The copyright for individual contributions belongs to the authors themselves.

Foreword

For more than a decade developers of computerised information systems have accepted the database approach, despite the shortcomings of the available software and design tools for database management. They have collectively 'voted with their feet' by investing in and installing large numbers of databases which are now indispensable to many organisations throughout the world. Yet the inadequacies of the Database Management Systems (DBMSs) available have not been trivial.

However, the advent of commercial relational database systems alongside other technological and methodological advances in the latter half of that decade or so has greatly improved this situation. There are currently great expectations, on the part of information systems developers and users alike, that the promise of these systems is at last near fulfilment. The contributions of the relational database systems to these hopes include the following:

• Enhanced independency of data and programs
• Provision of high-level interfaces to data collections
• Understanding of the properties of data.

Some, but by no means all, DBMSs which are tagged as 'relational' go a long way towards delivering truly relational capability and offer the attendant advantages. However, the history of the Relational Data Model (RDM) really started nearly 10 years before fully commercial systems appeared, and the embryonic relational ideas appeared even before E F Codd's famous papers in the early 1970s.

A major advantage of the relational model is the basis it provides for theoretical investigations of databases. It has underpinned nearly all research undertaken in databases since Codd's early papers and a wealth of significant results has accumulated. Many of these are of practical as well as theoretical interest, as for example in the area of schema design, while all are of interest per se. Increasingly, extensions to the basic RDM have appeared in the literature and these, taken with the current endeavours aimed at its integration with Artificial Intelligence (AI) techniques, hold much promise for future data handling products.

The design of a relational database is, by common consent, a very challenging task. A conceptual model of the aspects of the real world of interest must be produced and at lower design levels or layers this becomes a logical data model, and later a physical data model. The performance of an installed database system is of great importance to its developers and users at all levels. At all design levels excellent methods are being developed to assist the designer and many of these fit well within the newly named discipline of information engineering. Perhaps more attention to performance is still required in many of the better known methods. Several of the available commercial systems interface with information engineering tools.

Distributed processing and the integration of geographically distributed databases have increased in importance over the lifetime of the RDM.
There are several methods available for conversion or integration of a particular database with a pre-existing, possibly heterogeneous, database, perhaps the most comprehensive being the multidatabase approach. Products offering multidatabase capability are beginning to appear.

These innovations and others, such as the advent of database machines claimed to offer attractive alternatives to software implementation of database system components or database support for multimedia information systems, mean that relational database system developers and users will have many stimulating challenges in the next five years.

D A Bell: Editor

Publisher's note

This Report is divided into three parts:

1 Invited Papers.
2 Analysis.
3 Bibliography.

The Invited Papers in this State of the Art Report examine various aspects of relational databases. If a paper cites references they are given at the end of the Invited Papers section, numbered in the range 1-99 but prefixed with the first three letters of the Invited Paper author's name.

The Analysis has the following functions:

1 Assesses the major advances in relational databases.
2 Provides a balanced analysis of the state of the art in relational databases.

The Analysis is constructed by the editor of the Report to provide a balanced and comprehensive view of the latest developments in relational databases. The editor's personal analysis of the subject is supplemented by quotations from the Invited Papers, written by leading authorities on the subject.

The following editorial conventions are used throughout the Analysis:

1 Material in Times Roman (this typeface) is written by the editor.
2 Material in Times Italic (this typeface) is contributed by the person or publication whose name precedes it. The contributor's name is set in Times Italic.
Numbers in parentheses in the ranges 001-299 or 300-399 following the name refer to the original source as specified in the Analysis references or the Bibliography, respectively, which both follow the Analysis. References within the text are numbered in the same way. A contributor's name without a reference refers to an Invited Paper published in this Report.
3 The quotations in the Analysis are arranged at the discretion of the editor to bring out key issues. Three or four dots within a single quotation indicate that a portion of the original text has been removed by the editor to improve clarity.

The Bibliography is a specially selected compilation of the most important published material on the subject of relational databases. Each key item in the literature is reviewed and annotated to assist in selecting the required information.

1: INGRES/Distributed Database: meeting business needs

R J Ballard
Relational Technology
London
UK

The author examines changes which are taking place in computing, telecommunications and management, and the effect these have on the needs for distributed processing and distributed database. He looks at the theory and practice of distributed systems, using the example of INGRES from Relational Technology as illustration in the following account. The paper goes on to consider research in the fields of distributed processing and distributed database and to discuss products which are already commercially available from Relational Technology and also the plans for future products from the company.

R J Ballard 1986

R J Ballard

Ron Ballard is with the Technical Services group at Relational Technology International Ltd. He came to Relational Technology after having spent 12 years in the computer industry. The bulk of his career has involved developing and supporting database management systems.
Previously Mr Ballard was with Savant, where he was South of England project manager responsible for development and support of turnkey installations using relational database and Fourth Generation application development tools. Before joining Savant, he was a project leader with Cincom Systems (UK) for nearly five years, developing database and data dictionary systems. After starting his career in King's College Hospital computer centre and Drake and Scull Engineering, Mr Ballard formed his own analysis and programming company working for Systemsolve, the British Library, the Chloride Group Ltd and Oakland Software Ltd. At the British Library, he worked on development of the 'Merlin' bibliographical database system. Mr Ballard has a BSc (Hons) degree from the University of Manchester.

INGRES/Distributed Database: meeting business needs

Current hardware developments

A typical corporate data processing (DP) system today runs on a single powerful mainframe computer and supports many users. Often many of these users are at remote locations and access the central computer using a telephone link.

Concentrating all processing on a single large computer once appeared to be the most effective way to provide this kind of service. For years the best price/performance ratios could be gained only from the most powerful computers available. This situation has now changed. Chip technology has advanced to maturity. As technology has been pushed further and further for extra processing speed, economies of scale have been eclipsed by the law of diminishing returns. Raw processing power (measured in millions of instructions per second) can now be acquired more cheaply by buying several small processors than by buying one large processor. The benefits of this can only be realised, however, if all the small processors are used in parallel.

Some machines are being built to achieve speed for a single user by executing parts of a single program in parallel.
This currently requires programming techniques which are beyond the skill of most programmers and is most appropriate for 'number-crunching' in scientific, technical and engineering applications. Distributed processing and distributed database offer alternative methods of exploiting parallel processing, which are more accessible to commercial data processing.

Current software developments

Software has also been changing. The long-standing shortage of skilled professional application developers has driven the search for methods of improving productivity. Relational databases and Fourth Generation languages have been the most effective tools in the battle against the software backlog.

In the absence of a generally accepted definition there has been a lot of discussion about just what a Fourth Generation language is. For the purposes of this paper the author proposes the following definitions:

• A Fourth Generation language significantly increases productivity over Third Generation languages (COBOL, FORTRAN, Pascal, etc)
• A Fourth Generation language has database access language built in
• A Fourth Generation language has forms manipulation built in, with a separate forms definition and editing tool
• A Fourth Generation language provides database and forms independence through a fully integrated data dictionary.

Software productivity tools have traded application developers' time for computer processing time. More of the work of application development is being done by the computer system, and this fuels the demand for increased performance.

Another aspect of the changes in the application development environment is that the very largest, most powerful computers generally provide the worst, least productive, application development environment. Perhaps this is because the manufacturers of the largest computers have a market in organisations which have large and very long-term investments in their computer systems.
The result of this is that many companies, both the customers and the suppliers, are trapped in the pursuit of compatibility with older systems. This has locked them into archaic software technology. There still exists a vast army of programmers, working in commercial environments, building systems using the tools of the 1960s. Second Generation languages (Assembler) and basic file systems handled by the programmer, with data definition and screen design embedded, like pebbles in concrete, in the application programs, are all too common.

The use of distributed database can help organisations to move forward in two ways. First, a distributed database offers the possibility of larger databases and higher performance, increasing the choice of hardware to support large systems. Secondly, a distributed database which allows access to databases managed by other vendors' software makes a gradual transition to a more modern software environment practical.

Current telecommunications developments

Compared with the dramatic fall in hardware costs and increases in programmer productivity through the use of new tools, telecommunication costs have been roughly static over a period of time. Demand, however, has increased dramatically. Technological advances (digital working, optical fibres, automated electronic exchanges, and so on) have allowed the suppliers to follow this demand, but have not reduced costs, or even improved price/performance ratios significantly. There are few signs that indicate any change in these trends.

The implication of these developments for the future of geographically distributed computer systems is clear. Reductions in communications costs will only occur as a result of a decrease in communications traffic. An emerging trend will be the use of increased computer processing to reduce communications traffic.

In conventional large DP organisations information flows between the central computer and remote terminals.
Much of the data, however, will usually be of local interest only. A distributed database with local data kept on a local computer system, but with the possibility of accessing data from any other local site or the central site, could cut down network traffic enormously. Figures would vary widely from site to site, but it is not unreasonable to suppose that, in a typical organisation, 80 per cent of total data needs could be satisfied locally, with only 20 per cent requiring access to another node.

To be practical, a distributed database system must make access to the data independent of the site where it is stored. The data distribution must be transparent to the user, while allowing the distributed database administrator freedom to run the system for minimum network traffic and maximum overall throughput.

Recent developments in business management

The last few years have seen the publication of many studies describing the management of successful companies. In the increasingly competitive business world of the 1980s, pursuing excellence has become a necessity for any business which wishes to survive and grow, rather than a concern only of the most ambitious.

Ironically, it has become apparent that the growing organisation must try to preserve many of the characteristics of a small business if it is to remain competitive. The vast militaristic bureaucracy, with centralised, many-levelled reporting and decision-making hierarchies is no longer a realistic model for success.

Successful large companies go to great lengths to preserve an energetic, entrepreneurial culture in all kinds of ways. A manifestation of this is the separation of the company into smaller autonomous units, units which can behave more like small companies. At the same time, these small units need to act in harmony, to further the objectives of the organisation as a whole.

The information processing needs of successful, medium-to-large companies reflect this paradox.
Local branches of the organisation must have control over their own information. At the same time, if they are to act effectively as a part of the whole, they must conform to some company-wide standards and at least some of their data must be accessible company-wide. Typically, company-wide data will be used mostly for decision support.

A distributed database offers a solution to this paradox. The core information in the organisation's database can be defined and controlled centrally. Local branches will keep data relevant to their own operations locally. For the organisation to function effectively as a whole, it will often be necessary for one branch to access the central core information, and perhaps other branches' data in a controlled way. The central office will also need access to branch data for summary information about the performance of the whole organisation. The system can work effectively only if such data transfers can be achieved efficiently and easily.

Recent developments in DP management

DP management has been affected by the changes in general management thinking in recent years and by changes in the other areas described above. The changes in hardware, software and telecommunication costs have changed the rules of system development, but, in the main, central DP departments still develop centralised systems which are imposed with varying degrees of success (both in functionality and timeliness) on the user community.

The biggest impact on DP departments has come about as a result of the falling cost of hardware (and many software packages) which has reached a level where purchases are within the discretionary budgets of department heads. Many micro and minicomputer systems offer a degree of ease of use and price/performance which suggests that departmental, or even individual, computing is a real, cost-justifiable productivity booster.
The hidden cost has been loss of control by the DP department, with a resulting uncontrolled fragmentation and duplication of the organisation's data. Distributed database and distributed processing can help to bring the situation under control, without taking away from users the independence and flexibility they have come to expect from local computing facilities.

Distributed processing and distributed database

It is important to distinguish distributed processing from distributed database. Both are becoming key technologies in the modern organisation, and they can work together harmoniously to increase the effectiveness of individuals and the competitiveness of the organisation.

Distributed processing gives each user or group of users their own computer processors, with many such processors accessing one database. The data resides at a central computer.

A distributed database has data stored on many computers, but this data can be accessed as one consistent, integrated database from many sites.

Both distributed processing and distributed database require a reliable, effective networking system.

The information resource management approach

The goal of information resource management is to take the concept of distributed database one stage further, to allow access to many kinds of Database Management Systems (DBMSs), and view the whole collection of databases supported by different vendors' software as one integrated database.

There are costs in doing this. A system which is based on more than one DBMS will be inherently more complex, and therefore more difficult to manage and maintain than a system based on a single DBMS.
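
The requirement that data distribution be transparent to the user, and the supposition that roughly 80 per cent of data needs could be met locally, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the table names, the site names, the catalog structure and the sample workload are hypothetical, and it does not describe how INGRES or any other product works. The caller names only a table; a catalog maintained by the administrator decides where that table lives.

```python
from collections import Counter

# Hypothetical catalog kept by the distributed database administrator:
# it records which site stores each table. All names are illustrative.
CATALOG = {
    "london_orders": "london",
    "paris_orders": "paris",
    "products": "head_office",
}


def locate(table):
    """Return the site holding `table`; the user never names a site."""
    return CATALOG[table]


def classify_accesses(requests):
    """Count local and remote accesses for (requesting_site, table) pairs."""
    counts = Counter()
    for site, table in requests:
        kind = "local" if locate(table) == site else "remote"
        counts[kind] += 1
    return counts


# Assumed workload for a London branch: eight requests for its own data,
# one for centrally held data and one for another branch's data.
requests = (
    [("london", "london_orders")] * 8
    + [("london", "products"), ("london", "paris_orders")]
)

counts = classify_accesses(requests)
remote_fraction = counts["remote"] / len(requests)
print(counts["local"], counts["remote"], remote_fraction)
```

With this assumed workload, eight of the ten requests stay on the local machine and only two cross the network. Moving a table to another site means changing one catalog entry; the requests themselves do not change, which is the sense in which the distribution is transparent to the user.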
