Lecture Notes in Statistics

Vol. 1: R. A. Fisher: An Appreciation. Edited by S. E. Fienberg and D. V. Hinkley. xi, 208 pages, 1980.
Vol. 2: Mathematical Statistics and Probability Theory. Proceedings 1978. Edited by W. Klonecki, A. Kozek, and J. Rosinski. xxiv, 373 pages, 1980.
Vol. 3: B. D. Spencer, Benefit-Cost Analysis of Data Used to Allocate Funds. viii, 296 pages, 1980.

Springer Series in Statistics

L. A. Goodman and W. H. Kruskal, Measures of Association for Cross Classifications. x, 146 pages, 1979.
J. O. Berger, Statistical Decision Theory: Foundations, Concepts, and Methods. vi x, 420 pages, 1980.

Lecture Notes in Statistics
Edited by S. Fienberg, J. Gani, J. Kiefer, and K. Krickeberg

3

Bruce D. Spencer

Benefit-Cost Analysis of Data Used to Allocate Funds

Springer-Verlag
New York Heidelberg Berlin

Professor Bruce D. Spencer
Northwestern University
The School of Education
Evanston, Illinois 60201/USA

AMS Subject Classification: 62P25

Library of Congress Cataloging in Publication Data
Spencer, Bruce D
Benefit-cost analysis of data used to allocate funds.
(Lecture notes in statistics; 3)
Based on the author's thesis, Yale.
Bibliography: p.
1. Revenue sharing -- United States -- Cost effectiveness. I. Title. II. Series.
HJ275.S59 336.1'85 80-19589

ISBN-13: 978-0-387-90511-2
e-ISBN-13: 978-1-4612-6099-6
DOI: 10.1007/978-1-4612-6099-6

All rights reserved. No part of this book may be translated or reproduced in any form without written permission from Springer-Verlag.
© 1980 by Springer-Verlag New York Inc.
Softcover reprint of the hardcover 1st edition 1980

9 8 7 6 5 4 3 2 1

Preface

This monograph treats the question of determining how much to spend for the collection and analysis of public data. This difficult problem for government statisticians and policy-makers is likely to become even more pressing in the near future. The approach taken here is to estimate and compare the benefits and costs of alternative data programs.
Since data are used in many ways, the benefits are hard to measure. The strategy I have adopted focuses on use of data to determine fund allocations, particularly in the General Revenue Sharing program. General Revenue Sharing is one of the largest allocation programs in the United States. That errors in population counts and other data cause sizable errors in allocation has been much publicized. Here we analyze whether the accuracy of the 1970 census of population and other data used by General Revenue Sharing should be improved. Of course it is too late to change the 1970 census program, but the method and techniques of analysis will apply to future data programs. In particular, benefit-cost analyses such as this are necessary for informed decisions about whether the expense of statistical programs is justified. For example, although a law authorizing a mid-decade census was enacted in 1976, there is great doubt whether funds will be provided so that a census can take place in 1985. (The President's Budget for 1981 allows no money for the mid-decade census, despite the Census Bureau's request for $1.9 million for planning purposes.) Is this decision in the national interest? Explicit benefit-cost analyses are sorely needed.

Much of the present work is technical. The reader interested solely in policy implications is encouraged to read chapter 0 lightly and then proceed directly to chapter 7. For a thorough but non-technical course, I suggest chapter 0; chapter 1, first four sections at least; chapter 3, sections 1 and 2, then other sections as interest dictates; the first section of chapter 4; the first section or two of chapter 5; then chapters 6 and 7, skipping proofs all the while. The manuscript is heavily cross-referenced, so the reader should feel free to skip around, since work that is used later will be referenced at that time. In addition, guideposts placed throughout the work indicate parts that can be skipped on a first reading.
An overview of the work is given in chapter 0.

Acknowledgments

This monograph has arisen from my doctoral dissertation for the Department of Statistics at Yale University. I wish to express my indebtedness to I. Richard Savage, who led me to the problem and provided continuing guidance and stimulation. I am grateful also to William Kruskal, Stephen Dresch, Francis Anscombe, and Thomas Jabine for reading the manuscript and for helpful comments.

This research was largely carried out while I was at Yale, where I was partly supported by university fellowships and by a National Science Foundation grant (SOC 75-15614). I wrote the final draft while at the Committee on National Statistics of the National Research Council - National Academy of Sciences and while at Northwestern University. I am grateful to colleagues at all three places for their encouragement and support.

Evanston, Illinois    B. D. S.
March 1980

TABLE OF CONTENTS

Preface
List of Tables

Chapter 0 Introduction

Chapter 1 Loss Function and Benefit Measurement
  §1.0 Introduction
  1.1 Utility and Social Welfare
  1.2 Equity
  1.3 A Simple Loss Function for Errors in Allocation
  1.4 Estimating the Parameters of the Loss Function L(f, θ) for GRS
  1.5 Fisher-Consistency and Other Properties of the Loss Function L(f, θ)
  1.6 More General Loss Functions (including L_w)
  1.7 Exponential Loss Functions

Chapter 2 The Delta Method
  §2.0 Outline
  2.1 Notation
  2.2 Description of the Delta Method
  2.3 Applicability of the Delta Method to GRS
  2.4 Application of the Delta Method to Calculating Expected Loss

Chapter 3 Data Used in General Revenue Sharing
  §3.1 Introduction
  3.2 Population (state level)
  3.3 Urbanized Population
  3.4 Population (substate level)
  3.5 Total Money Income, Per Capita Income (state level)
  3.6 Per Capita Income (substate level)
  3.7 Personal Income
  3.8 Net State and Local Taxes
  3.9 State Individual Income Taxes
  3.10 Federal Individual Income Tax Liabilities
  3.11 Income Tax Amount (state level)
  3.12 Adjusted Taxes (substate level)
  3.13 Intergovernmental Transfers (substate level)
  3.14 Interrelationships Among State-level Data Elements
  3.15 Interrelationships Among Substate Data Elements
  3.16 Outline-Symbol Glossary

Chapter 4 Interstate Allocation in GRS
  §4.1 Introduction
  4.2 Notation
  4.3 Conclusions
  4.4 Assumptions and Degrees of Approximation
  4.5 Derivations and Proofs

Chapter 5 Intrastate Allocations in GRS
  §5.1 Overview
  5.2 Determination of Substate Allocations
  5.3 Notation
  5.4 Assumptions and Degrees of Approximations
  5.5 Conclusions

Chapter 6 Computations and Analyses
  §6.1 Introduction
  6.2 Errors in Allocation to States
  6.3 Substate Errors in Allocation
  6.4 Benefit Analysis
  6.5 Summary of Findings

Chapter 7 Policy Perspectives and Recommendations
  §7.0 Introduction and Summary
  7.1 Redundancy of Allocation Formulas
  7.2 Data Burdens Imposed by Legislation
  7.3 Minimizing Large Errors in Allocation
  7.4 Uniform Biases, Non-uniform Variances
  7.5 Adjusting for Undercoverage or Other Suspected Biases
  7.6 Improving the Coverage of the Decennial Census
  7.7 State Errors vs. Substate Errors
  7.8 Allocation Formulas as Approximations
  7.9 Other Uses of Data

Appendix A Tables of Biases in Data
Appendix B Determination of General Revenue Sharing Allocations
Technical Appendix
Bibliography

LIST OF TABLES

3.1 Target Variables - State Population
3.2 Target Variables - Substate Population
3.3 Target Variables - State Money Income, State Per Capita Income
3.4 Target Variables - Personal Income
3.5 Target Variables - Net State and Local Taxes
3.6 Target Variables - State Individual Income Taxes
3.7 Target Variables - Federal Individual Income Tax Liabilities
3.8 Target Variables - Substate Adjusted Taxes
4.1 Locator for Description of Data
5.0 Substate Units Constrained in New Jersey, EP 1
5.1 Data Element References
5.2 Substate Allocation Formulas
6.0 Total Allocations for GRS, EP's 1-11
6.1 Error Models Used for Sensitivity Analysis of State Allocations
6.2 Moments of Errors in GRS Allocations to States, EP 1
6.3 Moments of Errors in GRS Allocations to States, EP 6
6.4 Sums of Mean Absolute Errors in GRS Allocations to States, EP 1
6.5 Sums of Mean Absolute Errors in GRS Allocations to States, EP's 3, 4, 6
6.6 GRS Allocations to States Based on 1970 Census Reported Data and 1970 Census Data Corrected by the Basic Synthetic Method
6.7 GRS Allocations to States Based on 1970 Census Reported Data and 1970 Census Data Corrected by a Modified Synthetic Method
6.8 Moments of Errors in New Jersey County-Area Shares, EP 1
6.9 Moments of Errors in New Jersey County-Area Shares, EP 6
6.10 Changes in GRS Allocations to County Areas in California After Adjusting for Underenumeration
6.11 Moments of Errors in Allocation to Places in Essex County, EP 1
6.12 Moments of Errors in Allocation to Places in Essex County, EP 6
6.13 Sums of Mean Absolute Errors in Allocation to County Areas in New Jersey
6.14 Sums of Mean Absolute Errors in Allocation to Places in Essex County
A1 Estimates of State-level Relative Biases in Population, Urbanized Population, and Money Income
A2 Estimates of Undercoverage Rates in New Jersey Counties
A3 Estimates of Undercoverage Rates in Places and Municipalities in Essex County, New Jersey

Chapter 0. Introduction

In 1970 the U.S. census cost more than $220 million and failed to count an estimated 5.3 million people. The 1980 census will cost a billion dollars and the size of the undercount, or the number of persons missed, is also certain to be in the millions. If the undercount were evenly distributed among geographic regions and population subgroups, matters would not be as serious as they are. In fact, the undercount is spread unevenly, with certain minority groups and geographic regions undercounted more than others.

Census data are used in more than 100 formulas that allocate over $50 billion annually in federal aid. A greater than average undercount for a group or region can deny it its fair share of these funds. Since census data also determine apportionment of seats in the House of Representatives, undercount can cause a state to gain or lose a seat. National ethnic organizations worry that undercounts diminish their political influence by failing to indicate their true numbers.

Errors in census data clearly have significant impact, but how much should be spent to get good data? The 1970 census missed an estimated 2.5 percent of the population. Would it have been worthwhile to spend an extra $10 million to reduce the undercount? An extra $100 million? Similar questions, not just about censuses but about other data collection activities as well, are relevant to statistical agencies in general, already hard pressed by increasing demands for a multitude of data. And the present conflict between demands and scarce resources can be expected to worsen as the demands for data outstrip the resources for obtaining and providing the data.
The preferred way to cope with this problem is rational