
University of Pretoria
Department of Economics Working Paper Series

Health Care Facility Choice and User Fee Abolition: Regression Discontinuity in a Multinomial Choice Setting*

Steven F. Koch†, University of Pretoria
Jeffrey S. Racine‡, McMaster University

Working Paper: 2013-53
September 2013

Department of Economics, University of Pretoria, 0002, Pretoria, South Africa
Tel: +27 12 420 2413

September 6, 2013

Abstract

We apply parametric and nonparametric regression discontinuity methodology within a multinomial choice setting to examine the impact of public health care user fee abolition on health facility choice using data from South Africa. The nonparametric model is found to outperform the parametric model both in- and out-of-sample, while also delivering more plausible estimates of the impact of user fee abolition (i.e. the ‘treatment effect’). In the parametric framework, treatment effects were relatively constant (around 7%), and that increase was drawn equally from both home care and private care groups. On the other hand, in the nonparametric framework treatment effects were largest for the least well-off (also around 7%) but fell for the most well-off. More plausibly, that increase was drawn primarily from the home care group, suggesting that the policy favoured those least well-off as more of these children received at least some minimum level of professional health care after the policy was implemented. Regarding the most well-off, despite having access to free public health care, children were still far more likely to receive health care at private facilities than at public facilities, which is also more plausible in South Africa’s two-tier health sector.
* Koch would like to thank seminar participants at the University of the Free State and Emory University, as well as participants at the workshop for the Microeconometric Analysis of South African Data and Economic Research Southern Africa’s Public Economics Workshop, for their helpful comments and suggestions. Koch would also like to thank Dane Kennedy and the Centre for High Performance Computing (CHPC: www.chpc.ac.za) for their support. Racine would like to thank the Shared Hierarchical Academic Research Computing Network (SHARCNET: www.sharcnet.ca) for their ongoing support, gratefully acknowledges financial support from the Natural Sciences and Engineering Research Council of Canada (NSERC: www.nserc.ca) and from the Social Sciences and Humanities Research Council of Canada (SSHRC: www.sshrc.ca), and also thanks John Kealey, whose feedback led to many expository improvements in the manuscript. All remaining errors are the sole responsibility of the authors.

† Department of Economics, University of Pretoria, Pretoria, Republic of South Africa.

‡ Department of Economics & Graduate Program in Statistics, Department of Mathematics, McMaster University, Hamilton, Ontario, Canada L8S 4M4.

1 Introduction

Although Thistlethwaite & Campbell’s (1960) regression discontinuity (RD) methodology did not, initially, receive much attention in economics, RD applications have become increasingly prevalent; see the recent reviews by van der Klaauw (2008) and Lee & Lemieux (2010) by way of illustration. RD is likely to underpin empirical assessment of policy impacts for the foreseeable future, particularly given the recent authoritative guide by Imbens & Lemieux (2008) that facilitates its implementation. As highlighted in the aforementioned reviews and guide, part of RD’s appeal lies in delivering visual summaries of policy effects (‘treatment effects’) that are immediately accessible to the practitioner and policy analyst alike.
In many cases, it is possible to instantly summarize and communicate changes in average outcomes at the RD threshold, even when that threshold is fuzzy.

RD analyses tend to focus on average treatment effects, typically through the application of linear parametric ordinary least squares. If the threshold is fuzzy, however, the construction of local average treatment effects (Imbens & Angrist 1994) on compliers via linear parametric two-stage least squares is often adopted. Linear regression models remain popular in this setting, even when the outcome data is discrete.[1] An early example of an application of the so-called linear probability model (i.e. linear regression with binary discrete outcomes) can be found in DiNardo & Lee (2002), but more recent examples abound; see by way of illustration Silles (2009), Lindeboom, Llena-Nozal & van der Klaauw (2009), Kerr, Lerner & Schoar (2010) and Arcand & Wouabe (2010). With respect to multinomial discrete outcomes, both Lalive (2008) and Schmieder, von Wachter & Bender (2012) treat discrete duration data as if it were continuous in a linear parametric regression setting. Sometimes discrete outcome data is treated as if it were continuous via the construction of cell means; see Lemieux & Milligan (2008) and Carpenter & Dobkin (2009) by way of illustration;[2] see also Zuckerman, Lee, Wutoh, Xue & Stuart (2006), who apply linear least squares methods to discrete count data. Other times, multinomial discrete outcome data is re-categorized, as in Coe & Zamarro (2011), into binary outcomes such that linear probability models can be applied (although they mention considering probit models for the binary outcomes, they do not report the results). On the other hand, a nonparametric approach might proceed by treating and modeling variables according to their natural domain, i.e. ‘nominal’, ‘ordinal’, or ‘numeric’ (‘discrete’ or ‘continuous’), and then modeling a conditional probability directly for nominal/ordinal/discrete outcomes rather than a conditional mean.

[1] Regression models estimate a conditional mean, $E(Y \mid x)$, as opposed to a conditional probability, $\Pr(Y = y \mid x)$. For the binary (0/1) outcome case these objects coincide since $E(Y \mid x) = 0 \times \Pr(Y = 0 \mid x) + 1 \times \Pr(Y = 1 \mid x) = \Pr(Y = 1 \mid x)$ (this does not hold for multinomial discrete outcomes). But with linear regression and a binary outcome, the estimated conditional mean can lie outside $[0, 1]$, thereby violating basic axioms of probability. It is often avoided for this (and other) reasons; see Aldrich & Nelson (1995) for details.

[2] In addition to parametric linear models, Lemieux & Milligan (2008) and Carpenter & Dobkin (2009) consider local linear nonparametric regression; see below.

As underscored above, the use of linear regression with discrete, count, and multinomial outcomes remains popular.[3] Although linear models are easy to use and easy to interpret, their widespread adoption and unquestioning use is cause for concern, particularly when the outcome is binary or multinomial. One well-known problem is that it is possible for the predicted outcome for any observation, on either side of the RD threshold, to lie outside of the unit interval, thereby violating basic axioms of probability. Although estimates that lie outside the unit interval may be uncommon, none of the previously mentioned linear regression-based studies discuss this shortcoming. With respect to the impact of policy, linear probability models that at times generate predictions outside the unit interval can undermine estimates thereof. For instance, treatment effects reported in Card, Dobkin & Maestas (2004) are consistently larger when generated under the linear probability model than under the probit model.[4]

[3] Some exceptions to this include Albouy & Lequien (2009) and Ou (2010), who use non-linear probability models (i.e. non-linear regression with discrete outcomes) within an RD setting, rather than linear models in their analyses.

[4] As an aside we note that for duration data, although the use of linear probability models could produce invalid probability estimates, their use in such settings is even more problematic due to the presence of ‘duration dependence’, under which individuals with similar durations are likely to have common unobserved factors affecting the outcomes. Ignoring duration dependence, as least squares regression implicitly does, could also lead to bias in the estimated treatment effect.

In the case of unordered categorical outcomes (the case considered in our analysis), linear and nonlinear regression models are simply inappropriate. Although it is possible to estimate separate linear regressions for every binary pair in the set of unordered categorical outcomes, such an analysis ignores the risk that predictions might fail to satisfy simple axioms of probability. It also ignores the potential for choice dependence since the relationship between such categorical outcomes is independent of other outcomes by construction.[5]

In this paper, we use an RD design to examine the effect of the 1994 abolition of public health sector user fees on health care-seeking behavior of ill South African children. The policy eliminated health care fees for children under the age of six, pregnant and nursing mothers, as well as the elderly. However, health care services in South Africa are provided by both the private and public sectors. Therefore, ill children can receive professional health care in either the public or the private sector, or not receive any professional care at all (home care). The effect of user fee abolition on the use of public health care services is analyzed while taking into account this multinomial trio of unordered health care-seeking options (home care, private, and public).
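The unit-interval problem raised earlier for linear probability models can be illustrated with a minimal sketch. The data below are hypothetical (a binary outcome that switches from 0 to 1 at a threshold of zero), and the logistic index `2 * x` is an arbitrary choice made only for comparison; nothing here is taken from the South African data.

```python
import numpy as np

# Hypothetical data: a binary outcome that switches from 0 to 1 at x = 0.
x = np.linspace(-3.0, 3.0, 61)
y = (x > 0).astype(float)

# Linear probability model: ordinary least squares of y on a constant and x.
slope, intercept = np.polyfit(x, y, 1)
lpm_fitted = intercept + slope * x

# Any index passed through the logistic function stays strictly inside (0, 1).
# The index 2 * x is an arbitrary, hypothetical choice.
logit_fitted = 1.0 / (1.0 + np.exp(-2.0 * x))

print("LPM fitted range:  ", lpm_fitted.min(), lpm_fitted.max())
print("logit fitted range:", logit_fitted.min(), logit_fitted.max())
```

In this sketch the least squares fitted values fall below zero at the left edge of the support and exceed one at the right edge, whereas the logistic transformation cannot leave the unit interval.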
We avoid the methods outlined above, and instead estimate a parametric linear index multinomial logit model, a specification that has been used in similar settings. We then estimate a nonparametric multinomial outcome model that constructs the conditional probability directly; both of these empirical specifications guarantee that basic axioms of probability are satisfied. The nonparametric model is found to fit the data better than the popular multinomial logit model in both in-sample and out-of-sample assessment. These results suggest that the linear index multinomial logit model (which is not altogether different from the linear probability model employed in the majority of previous RD studies) is inappropriate in our setting.

[5] This problem is qualitatively similar to violating the Independence of Irrelevant Alternatives (IIA) assumption in the multinomial logit regression setting.

We also construct estimates of average treatment effects across the sub-population most likely to be affected by the policy, i.e. the least well-off from a socio-economic perspective. The measured impacts, summarized in a series of figures, indicate that non-constant treatment effects are at work in the data. The robust nonparametric results therefore call into question the commonly maintained assumption that treatment effects are constant. They also raise questions about the dominant focus in the literature on scalar estimates computed from pooled linear probability models with an RD indicator.

We are certainly not the first to adopt nonparametric methods within an RD context. Both Hahn, Todd & van der Klaauw (2001) and Imbens & Lemieux (2008) outline nonparametric local linear regression methods and discuss practical problems associated with nonparametric regression at a boundary point, which is important when using certain nonparametric methods in an RD setting.[6] Although the use of local linear regression mitigates boundary-bias problems, the local linear regression estimator (a non-linear probability model) is subject to the same critique as linear probability models in multinomial choice settings. That is, in multinomial choice settings, the estimated probabilities can lie outside the unit interval and violate basic axioms of probability. Furthermore, applying this nonparametric method to each binary pair that can be defined in a multinomial outcome setting, especially if those outcomes cannot be ranked, would implicitly assume independence of irrelevant alternatives (IIA), and thus applying the method within such a setting may not be valid. The IIA assumption is not presumed in the nonparametric method outlined by Hall, Racine & Li (2004) that is applied below. Therefore the method can be applied in all categorical outcome models, binary or multinomial, ranked or unranked. In addition, we model conditional probabilities directly rather than conditional means, as is done by the local linear approach which otherwise mirrors the linear probability model.[7]

While we are critical of the dominant linear parametric paradigm, we intend this paper to be constructive and instructive in nature. Not only are nonparametric methods capable of revealing features of the data that are masked by rigid parametric specifications, but they also offer practitioners a feasible alternative to such approaches, as we hope to demonstrate below. All code for the analysis undertaken in this paper is available upon request from the authors.
[6] Their methods rely on Cheng, Fan & Marron’s (1997) triangular kernel, and have been applied by Carpenter & Dobkin (2009) and McCrary & Royer (2011). Optimal bandwidths for these estimators are outlined in Imbens & Kalyanaraman (2012).

[7] As an aside, the RD methodology outlined below can be applied in a discrete duration data setting, including the case where there is duration dependence, so the generality of this approach ought to be appealing to practitioners.

2 Methodology

The user fee policy change announced in 1994 consisted of a number of components, including free public health care for ill children under the age of six, the elderly, as well as pregnant and nursing mothers. However, our analysis focuses only on the demand for curative care services for children under the age of six, given that data limitations preclude consideration of preventative care, antenatal care or effects related to nursing mothers. Furthermore, a number of other changes related to retirement pensions were enacted within a similar time frame. Thus, it was not possible to consider the effect of the policy on the elderly.[8] The demand for curative care services is analyzed within the context of health care facility choice. Gupta & Dasgupta (2002), among others, note that provider choice decisions are primarily related to curative care.

The component of the South African user fee abolition policy considered here was based on an age threshold, and so the analysis will be based on the application of RD. The age data described below is generally only available in years, although it is possible to merge exact birth dates from the survey, allowing for a more general analysis. However, that data is available only for children living with their mothers. For that reason, we do not make use of exact birth dates in this analysis.[9]

As noted previously, the policy was designed to improve access to health care within the public sector, although other health care-seeking options are available for ill children.
These options, such as care within the private sector and home care, are potential substitutes for public care. Therefore, the analysis is placed within a three-outcome model of health care facility choice. A parametric analysis of multinomial outcomes could be built on a multinomial logit or probit framework, which is where we shall begin our analysis (we report results for the logit model only, as both link functions deliver similar results). However, in addition to the multinomial logit framework, we also undertake nonparametric analysis based on direct estimation of conditional probabilities for the reasons outlined earlier. Each is described, in turn, below.

[8] For further information about the policy and previous analyses of the policy impact, the interested reader is pointed to Koch (2012) and the citations therein.

[9] Analysis with a continuous running variable is available; the main results and conclusions presented here remain unaffected.

2.1 Parametric Multinomial Logit Analysis

Denote by $Y_i$, with realizations $y_i$, a categorical indicator of health facility choice, which takes on the values $j \in \{0, 1, 2\}$, i.e.

$$
Y_i =
\begin{cases}
0, & \text{No professional medical treatment sought (home care)} \\
1, & \text{Treatment sought at a public facility} \\
2, & \text{Treatment sought at a private facility.}
\end{cases}
\tag{1}
$$

Furthermore, assume that there is a vector of explanatory variables, denoted by $X_i$, which have realizations $x_i$ in the data. These are assumed to represent socio-economic and demographic characteristics of the ill child, including a function of the child’s age. Given its central role in the analysis, we will make the age function explicit a little later on in our discussion. Following convention, we define $p_{ij}$ to be the probability that ill child $i$ receives treatment $j$, i.e. $p_{ij} = \mathrm{prob}(Y_i = j \mid X_i = x_i)$. By definition, $\sum_j p_{ij} = 1$, such that parameters in the parametric model can only be identified relative to a base category. Without loss of generality, $j = 0$ (home care) will be the base category.
Finally, assuming that the stochastic error terms are iid and follow an extreme value distribution, while the explanatory variables follow a linear index formulation, the underlying probabilities take on the familiar multinomial logit structure. The coefficient vectors, $\beta_1$ and $\beta_2$, are for outcome choices 1 and 2, respectively, and they are relative to home care (outcome 0). That is,

$$
p_{i0} = \left(1 + \sum_{j=1}^{2} e^{x_i'\beta_j}\right)^{-1} \tag{2}
$$

$$
p_{i1} = e^{x_i'\beta_1}\left(1 + \sum_{j=1}^{2} e^{x_i'\beta_j}\right)^{-1} \tag{3}
$$

$$
p_{i2} = e^{x_i'\beta_2}\left(1 + \sum_{j=1}^{2} e^{x_i'\beta_j}\right)^{-1}. \tag{4}
$$

The multinomial logit model can be estimated via maximum likelihood, where, for any ill child, the contribution to the log-likelihood is

$$
\ln \mathcal{L}_i(\beta) = \sum_{j=0}^{2} 1(y_i = j) \ln p_{ij}. \tag{5}
$$

In (5), the indicator function, $1(y_i = j)$, assumes a value of 1 if health care choice $j$ is chosen for child $i$, and 0 otherwise. The model is estimated using the ‘multinom’ function in the R (R Core Team 2013) package ‘nnet’ (Venables & Ripley 2002, Version 7.3-7).

Underlying this structure is the IIA assumption, wherein the odds ratios derived in the model do not depend on the number of choices available. For example,

$$
\frac{p_{i1}}{p_{i2}} = \frac{e^{x_i'\beta_1}}{1 + \sum_{j=1}^{2} e^{x_i'\beta_j}} \div \frac{e^{x_i'\beta_2}}{1 + \sum_{j=1}^{2} e^{x_i'\beta_j}} = e^{x_i'(\beta_1 - \beta_2)} \tag{6}
$$

is completely independent of the base choice, and would remain so for any other choices that could be added to the set of outcomes. Although IIA is a testable assumption (see e.g. Small & Hsiao (1985)), it will not be formally tested here, given the dominant performance of the robust nonparametric approach. Instead, the predictive performance of the multinomial logit model will be compared to the predictive performance of the nonparametric model. The comparison is outlined below. It is also true that IIA can be relaxed in a number of different ways, for instance through the nesting of alternatives, the allowance of random parameters, or assuming normally distributed, but correlated, stochastic error terms. We leave such analysis to the interested reader.
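To make the structure of equations (2) to (6) concrete, the following sketch evaluates the three choice probabilities for a single hypothetical child and checks the IIA property numerically. The covariate vector and coefficient values are invented for illustration, and the function names `mnl_probs` and `loglik_contribution` are not from the paper, which estimates the model with ‘multinom’ in R.

```python
import numpy as np

def mnl_probs(x, beta1, beta2):
    """Return (p_i0, p_i1, p_i2) as in equations (2)-(4), home care as base."""
    index = np.array([0.0, x @ beta1, x @ beta2])  # base category has index 0
    expo = np.exp(index - index.max())             # rescaling leaves the ratios unchanged
    return expo / expo.sum()

def loglik_contribution(y, p):
    """Equation (5): only the log-probability of the chosen category survives."""
    return np.log(p[y])

# Hypothetical values: a constant plus one covariate, and made-up coefficients.
x_i = np.array([1.0, 0.5])
beta_1 = np.array([0.2, -0.4])   # public facility relative to home care
beta_2 = np.array([-0.1, 0.8])   # private facility relative to home care

p = mnl_probs(x_i, beta_1, beta_2)
odds_12 = p[1] / p[2]                             # left-hand side of (6)
odds_12_closed = np.exp(x_i @ (beta_1 - beta_2))  # right-hand side of (6)
print(p, p.sum(), odds_12, odds_12_closed)
```

The probabilities sum to one by construction, and the public-to-private odds depend only on the difference between the two coefficient vectors, which is the IIA restriction described above.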
2.2 Nonparametric Conditional Probability Analysis

Although IIA can be relaxed in a number of different ways, most of the options remain restrictive and are, at least to some degree, ad hoc. For example, nesting requires the practitioner to assume that decisions are made in groups. An analyst might be willing to assume that a caregiver first decides whether or not an ill child should be treated at a health care facility, and once that decision is made, the treatment location might be selected. However, there is no reason to believe that the presumed nesting structure is necessarily valid. Meanwhile, assuming multivariate normality imposes a distribution on the error structure that may not be correct. Therefore, we also consider a consistent nonparametric estimator of the outcome probabilities rather than relying unquestioningly on the parametric multinomial logit model to obtain estimates of the respective probabilities.

Begin by defining $f(\cdot)$ and $m(\cdot)$ as the joint and marginal densities of $(X, Y)$ and $X$, respectively, where $Y$ represents the unordered categorical outcomes associated with health facility choice outlined in (1), while $X$ can include continuous, ordered and unordered categorical variables. The conditional probability density function of $Y = y$, given $X = x$, is defined by

$$
g(y \mid x) = \frac{f(x, y)}{m(x)}. \tag{7}
$$

An estimate of the conditional density can be formulated from the kernel estimates of the underlying joint and marginal densities, $\hat{f}$ and $\hat{m}$. Replacing the unknown densities in (7) with their estimates yields an estimate of the conditional density of $Y = y$, given $X = x$, which we write as

$$
\hat{g}(y \mid x) = \frac{\hat{f}(x, y)}{\hat{m}(x)}. \tag{8}
$$

Given the mix of continuous variables, ordered discrete variables and unordered discrete variables, Li & Racine’s (2003) generalized product kernel is used in the estimation. Following Li & Racine (2003), let $X = (X^c, X^u, X^o)$ denote a split of $X$ into $s$ continuous, $t$ discrete unordered and $r$ discrete ordered variables.
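The ratio in (8) can be sketched in a few lines for the simplest case of one continuous covariate and a three-category unordered outcome. Several ingredients here are assumptions rather than the paper's specification: a Gaussian kernel for the continuous covariate, an Aitchison-Aitken kernel for the outcome, bandwidths `h` and `lam` fixed by hand, and an invented toy sample.

```python
import numpy as np

def gaussian_kernel(X, x, h):
    """Assumed continuous kernel: Gaussian density with bandwidth h."""
    return np.exp(-0.5 * ((X - x) / h) ** 2) / (h * np.sqrt(2.0 * np.pi))

def aitchison_aitken(Y, y, lam, c):
    """Assumed unordered kernel over c categories; its weights sum to 1 over y."""
    return np.where(Y == y, 1.0 - lam, lam / (c - 1))

def cond_prob(y, x, X, Y, h, lam, c):
    """Estimate g(y | x) as the ratio f_hat(x, y) / m_hat(x), as in (8)."""
    w = gaussian_kernel(X, x, h)
    f_hat = np.mean(w * aitchison_aitken(Y, y, lam, c))  # joint density estimate
    m_hat = np.mean(w)                                   # marginal density estimate
    return f_hat / m_hat

# Hypothetical sample: one covariate and a facility choice coded 0, 1, 2.
X = np.array([1.0, 2.0, 2.5, 3.0, 4.0, 4.5, 5.0, 6.0])
Y = np.array([0, 0, 1, 0, 1, 2, 1, 2])

probs = np.array([cond_prob(j, 4.0, X, Y, h=1.0, lam=0.1, c=3) for j in range(3)])
print(probs, probs.sum())
```

Because the outcome kernel weights sum to one across categories, the estimated probabilities are non-negative and sum to one at every evaluation point, which is the property that distinguishes this estimator from regression-based approaches.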
The estimate of the marginal density $m$ for realizations $x$ is given by

$$
\hat{m}(x) = \hat{m}(x^c, x^u, x^o)
= \frac{1}{n} \sum_{i=1}^{n} \left[ \prod_{k=1}^{s} W(X^c_{ik}, x^c_k) \prod_{k=1}^{t} \ell^u(X^u_{ik}, x^u_k) \prod_{k=1}^{r} \ell^o(X^o_{ik}, x^o_k) \right]. \tag{9}
$$

Similarly, the estimate of the joint density $f$ for realizations $(x, y)$ is given by

$$
\hat{f}(x, y) = \hat{f}(x^c, x^u, x^o, y^u)
= \frac{1}{n} \sum_{i=1}^{n} \left[ \prod_{k=1}^{s} W(X^c_{ik}, x^c_k) \prod_{k=1}^{t} \ell^u(X^u_{ik}, x^u_k) \prod_{k=1}^{r} \ell^o(X^o_{ik}, x^o_k) \right] \ell^u_y(Y^u_i, y^u). \tag{10}
$$

Within the structure of equations (9) and (10), there are three different $X$ data types
