
Post-selection Inference for Forward Stepwise and Least Angle Regression

96 Pages·2014·2.45 MB·English

Post-selection Inference for Forward Stepwise and Least Angle Regression
Ryan Tibshirani & Rob Tibshirani
Carnegie Mellon University & Stanford University
Joint work with Jonathan Taylor, Richard Lockhart
September 2014

[Slides 2-4: photographs of Ryan Tibshirani (CMU; PhD student of Taylor, 2011) and Rob Tibshirani (Stanford), with top face-matching results from picadilo.com: 81%, 71%, 69%.]

Conclusion

Confidence, the strength of evidence, matters!

Outline

• Setup and basic question
• Quick review of least angle regression and the covariance test
• A new framework for inference after selection
• Application to forward stepwise and least angle regression
• Application of these and related ideas to other problems

Setup and basic question

• Given an outcome vector y ∈ R^n and a predictor matrix X ∈ R^(n×p), we consider the usual linear regression setup:

    y = Xβ* + σε,

  where β* ∈ R^p are unknown coefficients to be estimated, and the components of the noise vector ε ∈ R^n are i.i.d. N(0, 1).
• Main question: if we apply least angle or forward stepwise regression, how can we compute valid p-values and confidence intervals?

Forward stepwise regression

• This procedure enters predictors one at a time, choosing at each stage the predictor that most decreases the residual sum of squares.
• Defining RSS to be the residual sum of squares for the model containing k predictors, and RSS_null the residual sum of squares before the kth predictor was added, we can form the usual statistic

    R_k = (RSS_null − RSS) / σ²

  (with σ assumed known), and compare it to a χ² distribution on 1 df.
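The first forward stepwise step and its naive chi-squared comparison can be sketched in a few lines of NumPy. This is our illustration, not code from the talk: it uses no intercept, so the drop in RSS from the null model when predictor j alone is entered has the closed form (x_j'y)² / (x_j'x_j), and R_1 is the largest such drop divided by σ². A small Monte Carlo then checks the naive χ²(1 df) cutoff under the global null:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, sigma = 100, 10, 1.0

# Global null: y = X beta* + sigma * eps with beta* = 0
# (no intercept, a simplification of the slides' setup)
X = rng.standard_normal((n, p))
y = sigma * rng.standard_normal(n)

# Drop in RSS when predictor j alone is entered: (x_j' y)^2 / (x_j' x_j)
drops = (X.T @ y) ** 2 / np.sum(X ** 2, axis=0)
j_hat = int(np.argmax(drops))     # predictor chosen at step 1
R1 = drops.max() / sigma ** 2     # naive statistic R_1

# Monte Carlo check of the naive chi-squared(1) test under the null
reps = 2000
q95 = 3.841458820694124           # 95% quantile of chi-squared, 1 df
count = 0
for _ in range(reps):
    Xs = rng.standard_normal((n, p))
    ys = sigma * rng.standard_normal(n)
    d = (Xs.T @ ys) ** 2 / np.sum(Xs ** 2, axis=0)
    count += d.max() / sigma ** 2 > q95
type1 = count / reps              # empirical type I error, far above 5%
```

The point of the sketch: R_1 is the maximum of p (here 10) roughly chi-squared(1) variates, not a single one, so comparing it to a χ²(1 df) quantile rejects much more often than the nominal level, consistent with the liberality reported on the next slide.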
Simulated example: naive forward stepwise

Setup: n = 100, p = 10, true model null.

[Figure: quantile-quantile plot of the first-step test statistic against the Chi-squared distribution on 1 df.]

Test is too liberal: for nominal size 5%, actual type I error is 39%.

(Yes, Larry, one can get proper p-values by sample splitting: but it is messy, with a loss of power.)

Quick review of LAR and the covariance test

Least angle regression (LAR) is a method for constructing the path of solutions for the lasso:

    min_{β0, β} Σ_i (y_i − β0 − Σ_j x_ij β_j)² + λ · Σ_j |β_j|

LAR is a more democratic version of forward stepwise regression.
• Find the predictor most correlated with the outcome.
• Move the parameter vector in the least squares direction until some other predictor has as much correlation with the current residual.
• This new predictor is added to the active set, and the procedure is repeated.
• Optional ("lasso mode"): if a non-zero coefficient hits zero, that predictor is dropped from the active set, and the process is restarted.
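The steps above are what scikit-learn's lars_path computes; a minimal sketch (the data-generating choices here are ours, for illustration only):

```python
import numpy as np
from sklearn.linear_model import lars_path

rng = np.random.default_rng(1)
n, p = 100, 5
X = rng.standard_normal((n, p))
beta_star = np.array([3.0, 1.5, 0.0, 0.0, 0.0])
y = X @ beta_star + rng.standard_normal(n)

# method='lar' follows the steps listed above; method='lasso' adds the
# optional rule that drops a predictor whose coefficient hits zero
alphas, active, coefs = lars_path(X, y, method='lar')

print(active)       # order in which predictors entered the active set
print(coefs.shape)  # (p, number of steps + 1): coefficients along the path
```

With the strongest true coefficient on the first variable, that predictor is the most correlated with the outcome and enters the active set first, exactly the behavior described in the first bullet.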

