
Data Lake Architecture: Designing the Data Lake and Avoiding the Garbage Dump PDF

209 Pages·2016·16.88 MB·English

Preview Data Lake Architecture: Designing the Data Lake and Avoiding the Garbage Dump

Designing the Data Lake and Avoiding the Garbage Dump
first edition
Bill Inmon

Published by: 2 Lindsley Road Basking Ridge, NJ 07920 USA
https://www.TechnicsPub.com

Cover design by John Fiorentino
Edited by R A Peters

All rights reserved. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage and retrieval system, without written permission from the publisher, except for the inclusion of brief quotations in a review.

The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

All trade and product names are trademarks, registered trademarks, or service marks of their respective companies, and are the property of their respective holders and should be treated as such.

Copyright © 2016 by Bill Inmon
ISBN, print ed. 9781634621175
ISBN, Kindle ed. 9781634621182
ISBN, ePub ed. 9781634621199
ISBN, PDF ed. 9781634621205
First Printing 2016
Library of Congress Control Number: 2016935768

To Dr. Sylvia Sydow, you mean the world to me

Contents at a Glance
Chapter 1: Data Lakes
Chapter 2: Transforming the Data Lake
Chapter 3: Inside the Data Lake
Chapter 4: Data Ponds
Chapter 5: Generic Structure of the Data Pond
Chapter 6: Analog Data Pond
Chapter 7: Application Data Pond
Chapter 8: Textual Data Pond
Chapter 9: Comparing the Ponds
Chapter 10: Using the Infrastructure
Chapter 11: Search and Analysis
Chapter 12: Business Value in the Data Ponds
Chapter 13: Additional Topics
Chapter 14: Analytical and Integration Tools
Chapter 15: Archiving Data Ponds
Glossary
References
Index

Table of Contents
Introduction
Chapter 1 Data Lakes
    Enter Big Data Enter the Data Lake “One Way” Data Lake In Summary
Chapter 2 Transforming the Data Lake
    Metadata Integration Mapping Context Metaprocess Data Scientist General Usability In Summary
Chapter 3 Inside the Data Lake
    Analog Data Application Data Textual Data Another Perspective In Summary
Chapter 4 Data Ponds
    Conditioning Data Raw Data Pond Analog Data Pond Application Data Pond Textual Data Pond Data Passing Directly Into the Data Ponds Archival Data Pond In Summary
Chapter 5 Generic Structure of the Data Pond
    Pond Descriptor Pond Target Pond Data Pond Metadata Pond Metaprocess Pond Transformation Criteria In Summary
Chapter 6 Analog Data Pond
    Analog Data Issues Data Descriptor Capturing Raw Data/Transforming Raw Data Transforming/Conditioning Raw Analog Data Data Excision Clustering Data Data Relationships Probability of Future Usage Outliers Specialized Ad Hoc Analysis In Summary
Chapter 7 Application Data Pond
    DNA of Data Descriptors Standard Database Format Basic Organization of Data Integration of Data Data Model Necessity of Integration Pointing From one Application to the Next Intersecting Applications Subsets of Data in the Application Data Pond In Summary
Chapter 8 Textual Data Pond
    Uniform Data and the Computer Valuable Text

Description:
Organizations invest incredible amounts of time and money obtaining and then storing big data in data stores called data lakes. But how many of these organizations can actually get the data back out in a useable form? Very few can turn the data lake into an information gold mine. Most wind up with g
